Directed Evolution of Oxidase Enzymes



This invention is concerned with the production of modified enzymes, particularly 
oxidase enzymes, and more particularly galactose oxidase enzymes. Recombinant techniques 
such as directed evolution are used to obtain polynucleotide and polypeptide products having 
desirable properties. Galactose oxidase variants with increased activity and increased
thermostability relative to the wild-type enzyme are described.



BACKGROUND OF THE INVENTION 



An "oxidation enzyme" is an enzyme that catalyzes one or more oxidation reactions,
typically by adding, inserting, contributing or transferring oxygen from a source or donor to a
substrate. Such enzymes are also called oxidoreductases or redox enzymes, and encompass
oxygenases, hydrogenases or reductases, oxidases and peroxidases. One such enzyme is
galactose oxidase. This invention relates to the selection and production of polynucleotides that
encode polypeptides or proteins with biological activity as oxidation enzymes, and in particular
galactose oxidase enzymes. These enzymes are produced in facile expression systems such as
robust prokaryotic cells (e.g., bacteria) and eukaryotic systems (e.g., fungi and yeast).






Field of the Invention 

The invention concerns the recombinant production of functional eukaryotic proteins by 
host cells, in high yield, with increased activity, and/or with increased stability, e.g.,
thermostability. Preferred proteins of the invention include oxidase enzymes (oxidases) such as
polypeptides evolved from galactose oxidase (D-galactose:oxygen 6-oxidoreductase or GAO,
EC 1.1.3.9). Polynucleotides which encode and express these proteins in recombinant host cell
expression systems, and the resulting polypeptides, are encompassed by the invention. 

The publications and reference materials noted herein and listed in the appended 
Bibliography are each incorporated by reference in their entirety. They are referenced 
numerically in the text and the Bibliography below. 

Production of Enzyme Variants 

Many proteins of interest are produced by organisms having "eukaryotic" cells. These 
are cells having a nucleus surrounded by its own membrane and containing DNA on structures 
called chromosomes. All multicellular organisms, such as humans and animals, and many
single-cell animals, have eukaryotic cells. Other single-cell organisms, such as bacteria, have
"prokaryotic" cells. These cells have a primitive nucleus with DNA in a defined structure, but
without the chromosomes and nuclear membrane that are characteristic of eukaryotes. Prokaryotic
organisms are generally much easier and less costly to grow, maintain and manipulate than 
eukaryotic cells. 

Genetic engineering and recombinant DNA and RNA technologies have made it possible
to produce proteins, hormones and enzymes that are native to one organism, by using the cells
of a different organism as "factories" or host cell expression systems. In particular, it is often
desirable to express a protein of eukaryotic origin in a prokaryotic host cell, because the 
prokaryotes can be grown in large quantities of identical cells, to produce large amounts of the 
desired foreign protein. For example, certain human proteins may be useful as drugs if they can 
be supplied in sufficient quantity to patients who have a protein deficiency. Such proteins may 
not easily or ethically be obtained by isolating them from human cells, nor can they easily be 
made by direct chemical synthesis or by growing them in isolated tissue cultures. Other proteins 
and enzymes are useful in industry. For example, certain enzymes can break down food 
products, and are useful in laundry detergent. However, commercial applications require large
amounts of protein and a high degree of quality control. Desirable applications also require or
would benefit from more active or more thermostable (heat resistant) proteins or enzymes. 

To solve some of these problems, recombinant genetic engineering techniques have been 
developed to use genetic machinery of other cells, such as bacteria and yeast, to produce human 
or other proteins. Selected genetic material, such as a polynucleotide that encodes a desired 
protein, is "recombined" with genetic material in a host cell, so that the host cell expresses the 
introduced foreign genetic material and produces the desired polypeptide or protein. Bacteria, 
fungi and yeast can be suitable host cells because they are easy and economical to grow and
maintain in large quantities, and can be used to reliably and repeatably produce foreign proteins. 
Some proteins that are made by cells can be secreted or delivered outside the cell, which can 
improve the yield and the efficiency of subsequent isolation and purification steps. 

Directed evolution has been successfully applied to improve a variety of enzyme 
properties, such as substrate specificity, activity in organic solvents, and stability at high 
temperatures, which are often critical for industrial applications (5). This evolutionary approach 
uses DNA shuffling, for simultaneous random mutagenesis and recombination, to generate a 
variant having an improved desirable property over the existing wild type protein. Point 
mutations are generated due to the intrinsic infidelity of Taq-based polymerase chain reactions 
(PCR) associated with reassembly of nucleic acid sequences. In one example, Stemmer and
coworkers applied this technique to the gene encoding green fluorescent protein (GFP),
which resulted in a protein that folded better than the wild type in E. coli (10). Other examples
are in the literature (11-18, 21-25, 27-34, 47-58, 60-63, 65-75). Eukaryotic enzymes have a
myriad of existing and potential applications, but improvement of these and other proteins by 
directed evolution is desirable. For example, the difficulty of expressing certain oxidase enzymes 
in a facile expression host has posed technical challenges. Efforts to modify these enzymes for 
industrial applications by protein engineering methods have been impeded. Directed evolution, 
for example, exploits expression in a host such as E. coli or S. cerevisiae, organisms in which
large libraries of mutants or variants can be made. Also, the lack of efficient expression in an
appropriate foreign (heterologous) host can prevent the mass production of some of these
proteins on an economical scale. Thus, there continues to be a need for new ways to produce
new proteins, and for new proteins and enzymes having new or enhanced biological properties. 

Galactose Oxidase Enzymes 

One protein of interest is the oxidation enzyme galactose oxidase. Galactose oxidase
(D-galactose:oxygen 6-oxidoreductase, GAO; EC 1.1.3.9) is an enzyme containing a single copper
ion, and is secreted by a number of fungal species. Fusarium NRRL 2903, formerly known as
Dactylium dendroides, has been the most extensively studied (76). The enzyme is a glycoprotein
with a carbohydrate content of about 1.7% and consists of a single polypeptide chain of 639
amino acid residues with a molecular mass of 68,000 Da (77, 78). The reaction catalyzed by GAO
is the oxidation of primary alcohols to the corresponding aldehydes, coupled to the two-electron
reduction of O2 to hydrogen peroxide (79).

The enzyme oxidizes an unusually broad range of substrates. It accepts D-galactose 
(FIG. 1), alpha- and beta-galactopyranosides, oligo- and polysaccharides and considerably 
smaller molecules, such as glycerol and allyl alcohol, as substrates (77, 80-82). GAO exhibits 
prochiral (only the pro-S hydrogen is abstracted) as well as enantiomeric specificity for galactose 
(only D-galactose is oxidized by the enzyme) (80, 83). Furthermore, GAO strictly discriminates 
against D-glucose, the C-4 epimer of D-galactose, as a substrate or ligand. D-glucose does not 
bind to GAO at concentrations as high as 1 M (80, 84). The kinetic parameters of GAO for the 
oxidation of galactose are: Km = 67 mM, kcat = 3,000 sec^-1, kcat/Km = 45 x 10^3 M^-1 sec^-1 (85).
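
The catalytic efficiency quoted for GAO follows from the other two kinetic parameters once Km is converted from mM to M. The short Python sketch below is only an arithmetic check of the reported values; the function name is illustrative and does not come from the cited reference.

```python
def catalytic_efficiency(kcat_per_sec: float, km_molar: float) -> float:
    """Return kcat/Km in M^-1 sec^-1, given kcat in sec^-1 and Km in M."""
    if km_molar <= 0:
        raise ValueError("Km must be positive")
    return kcat_per_sec / km_molar


# Values reported for GAO with galactose (85): Km = 67 mM, kcat = 3,000 sec^-1.
efficiency = catalytic_efficiency(3000.0, 67e-3)
print(f"kcat/Km = {efficiency:.2e} M^-1 sec^-1")
```

The result is roughly 4.5 x 10^4 M^-1 sec^-1, in agreement with the rounded figure of 45 x 10^3 given above.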

The crystal structure of GAO has been reported (86). It consists of three predominantly 
beta-structure domains. The copper ion lies on the solvent-accessible surface of the second and
largest domain (residues 156-532) (78, 87). Tyr-272, Tyr-495, His-496, His-581 and a water 
molecule are the copper ligands at pH 7.0. The crystal structure also reveals a novel thioether 
bond linking Cys-228 and Tyr-272 and supports the presence of a tyrosine free radical at the 
active site (79). The active site structure of GAO is shown in FIG. 2. Site-directed mutagenesis
of Tyr-495 and Cys-228 has confirmed their involvement in catalysis (85, 88).

GAO is useful in a wide variety of applications, ranging from analytical and food 
chemistry to chemoenzymatic synthesis and clinical testing. For example, biological sensors 
based on GAO have been developed to determine the content of galactose (89), lactose and
other GAO substrates (90). Such biosensors have also been used for quality control in dairy 
industries (91, 92), online bioprocess monitoring (93) and analysis of blood samples of patients 
with suspected galactosemia (94). The stereospecificity and broad substrate specificity of GAO 
have been exploited in the chemoenzymatic synthesis of L-sugars from polyols (95), which are 
usually difficult to prepare by chemical methods (96, 97), as well as sugar-containing polyamines 
(98) and 5-C-(hydroxymethyl)hexoses (99). GAO applications in synthesis have been limited due 
to its relatively low activity toward a large number of primary alcohols (100). Additionally,
GAO is used for the detection of the disaccharide D-galactose-beta-(1->3)-N-
acetylgalactosamine (Gal-GalNAc), a tumor marker in colonic cancer and precancer, and
provides a cost-effective screening test for patients with neoplasia or at risk of developing
neoplasia (101, 102). GAO also finds applications in food chemistry. For example, it has been used
in oxidized guar manufacture (103) and to treat the oligosaccharide fraction contained in honey 
(104). Finally, GAO is used to oxidize the cell surface polysaccharides of membrane-bound 
glycoproteins containing terminal non-reducing galactose residues: this is an essential step in the 
successful radiolabeling of these glycoconjugates (105, 106). 

Modified and particularly improved or optimized GAO enzymes are useful to improve
and expand the use of the enzyme in practical applications. For example, enzymes of the 
invention include GAO variants that are more active, more thermostable, or both. Increased 
activity and/or expression as well as high thermostability may significantly decrease the cost of 
enzyme production, simplify its purification and handling, and prolong its shelf-life. Other 
properties of the enzyme may also be varied, for example to optimize activity towards particular 
substrates or toward other substrates such as polymeric materials and glucose. 

Use of these evolved enzymes in biosensors and diagnostics can increase sensitivity, 
decrease the response time and enhance the detection range. In addition, a more stable enzyme 
will find applications in the construction of biosensors with prolonged stability. An evolved 
GAO with improved activity toward poor GAO substrates, such as allyl alcohol and glucose, will 
provide new and improved applications of the enzyme in organic synthesis and other sensor 
applications. For chemical synthesis applications, selective oxidation of alcohols to the 
corresponding aldehydes avoids the use of protecting groups, minimizes side reactions often
observed in traditional chemical synthesis, and is an environmentally friendly process. Use of
such GAO enzymes as a synthetic reagent would facilitate the use of more inexpensive, safe and
biodegradable carbohydrate materials in industrial processes (107).

A more efficient enzyme is expected to be advantageous in the food chemistry
applications of GAO and, in particular, in the selective modification of guar and other
carbohydrate-based polymers. GAO variants according to the invention would also be useful
for modification of carbohydrate-based (e.g. cellulosic) textiles and other materials. The
aldehyde function produced by the GAO can be used to couple other substances selectively at
the modified position on the polymer. 

Accordingly, there is a need to develop new and improved GAO enzymes, as well as 
methods for expressing such proteins. In particular, there is a need for protein expression 
methods which are well-suited for use in connection with directed evolution techniques. 

This invention describes methods for screening libraries of GAO mutants produced by
error-prone PCR and DNA shuffling, to identify mutants that are expressed in bacteria (e.g.,
E. coli) and that have improved GAO function. Micro-plate and membrane screening techniques are
disclosed. In one embodiment, the mutant is a functional and active galactose oxidase (GAO)
that is expressed in E. coli at levels of about 65 times the activity of a parent recombinant wild
type (for D-galactose). The activity for other substrates, such as allyl alcohol, is also about 65
times that of wild type. Mutants of the invention can have any fraction or multiple of the
corresponding wild type activity, but preferably are more active, e.g., about 2 to 200 times as
active. Mutants also are more thermostable. Enzyme yield is generally at least about 10 mg/l.

SUMMARY OF THE INVENTION 

The observed constraints on the use of native proteins are thought to be a consequence 
of evolution. Proteins have evolved in the context and environment of a living organism, to carry 
out specific biological functions under conditions conducive to life - not in the laboratory or 
under industrial conditions. In some cases, evolution may favor or even require less than 
optimally efficient enzymes. The output, efficiency, working conditions, stability and other
properties of known expression systems are not thought to be unalterable, nor are they
limitations which should be seen as intrinsic to the nature of cellular expression systems. It is
possible that the proteins used in these systems can be evolved in vitro, or that analogous
proteins can be otherwise developed, to alter or enhance the protein's properties, for example, 
to obtain much more efficient expression, activity and thermostability. Improved proteins can 
also be obtained by screening cultures of native organisms or expressed gene libraries (3). 

The invention provides a method for improving the expression, thermostability, and/or 
the activity toward one or more substrates, of a polynucleotide encoding oxidase enzymes by 
using directed evolution. The invention also provides polynucleotides encoding for variant 
oxidase enzymes which have improved properties in conventional expression systems. 
According to one embodiment of the invention, directed evolution or random mutagenesis is 
used to produce GAO variants which are more highly expressed, more active, and/or more 
thermostable in prokaryotic expression systems such as E. coli.

The above features and many other attendant advantages of the invention will become 
better understood by reference to the following detailed description when taken in conjunction 
with the accompanying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 shows a reaction scheme in which a D-galactose substrate is oxidized to produce 
a D-galactohexodialdose product, in the presence of galactose oxidase (GAO) enzyme. 
FIG. 2 shows the active site structure of GAO at pH 7.0.

FIG. 3 is a graph showing the effect of metal ions (particularly copper ions) on the 
activity of a recombinant wild-type GAO, pGAO-010. Enzyme solutions with additives were 
kept at 4 °C for 1 hr before assay. Relative activity of enzyme solution with 1 mM copper (II) 
sulfate was estimated as 100 %. 
FIG. 4 is a graph showing GAO activity for various clones generated by error-prone
PCR, with varying concentrations of MnCl2, using conditions A of TABLE 3.

FIG. 5 is a graph showing GAO activity for various clones generated by error-prone
PCR, with varying concentrations of MnCl2, using conditions C of TABLE 3.

FIG. 6 shows the sequences of PCR primers used herein for amplification, e.g. of the
whole galactose oxidase gene.

FIG. 7 is a schematic representation of the construction of plasmid pUC18-EHL. 

FIG. 8 is a schematic representation of the construction of plasmid pGAO-010.

FIG. 9 is a schematic representation of the construction of plasmids pGAO-027 and
pGAO-036.

FIG. 10 is a schematic representation of the construction of plasmids pGAO-006 and 
pGAO-011. 

FIG. 11 shows the structures and activities of representative plasmids encoding GAO 
according to the invention, with IPTG-induced expression in host E. coli. Permeable cells which
were treated by freeze (-20 °C), thaw (4 °C) and 0.5 mg/l lysozyme for 30 minutes at 37 °C were
used for assay. An activity given as * indicates that cells did not grow in test tube culture;
** indicates that a transformant was not obtained.

FIG. 12 shows a scheme for the design of plasmids according to the invention. 

FIG. 13 shows the structures and activities of additional plasmids encoding GAO
according to the invention, with IPTG-induced expression in host E. coli.

FIG. 14 is a graph comparing the GAO activities of GAO plasmids with and without 
random codon alternation. 

FIG. 15 shows substrate specificities for a wild type galactose oxidase and a recombinant
galactose oxidase enzyme of the invention. Partially purified galactose oxidase from D.
dendroides (Sigma) and cell-free extract from E. coli BL21(DE3)/pGAO-010 were used.
Relative activities for D-galactose were estimated as 100 %. (+) indicates that oxidation was
detected, but activities were too low to be estimated; n.d. indicates that activities were not
distinguishable from background absorbance levels.
FIG. 16 is a graph showing the thermal stability of selected GAO mutants.

FIGS. 17A-C show the sequence of representative mutant 9.16.8D2 of the invention
[SEQ. ID NO. 10].

FIGS. 18A-C show the sequence of representative mutant 9.16.6C11 of the invention
[SEQ. ID NO. 11].

FIGS. 19A-C show the sequence of representative mutant 9.16.16D12 of the invention
[SEQ. ID NO. 12].

FIGS. 20A-C show the sequence of representative mutant 11.03.6D3 of the invention
[SEQ. ID NO. 13].

FIGS. 21A-C show the sequence of representative mutant 11.03.10C3 of the invention
[SEQ. ID NO. 14].

FIGS. 22A-C show the sequence of representative mutant 11.03.10D6 of the invention
[SEQ. ID NO. 15].

FIGS. 23A-C show the sequence of representative mutant 11.03.13E12 of the invention
[SEQ. ID NO. 16].

FIGS. 24A-C show the sequence of representative mutant 1.06.20E7 of the invention
[SEQ. ID NO. 17].

FIGS. 25A-C show the sequence of representative mutant 1.D4 of the invention
[SEQ. ID NO. 18].

FIGS. 26A-C show the sequence of representative mutant 2G4 of the invention
[SEQ. ID NO. 19].

FIGS. 27A-C show the sequence of representative mutant 3.H7 of the invention
[SEQ. ID NO. 20].

FIGS. 28A-C show the sequence of representative mutant 4.F12 of the invention
[SEQ. ID NO. 21].

DETAILED DESCRIPTION OF THE INVENTION



This invention concerns methods for improving the expression, activity and/or 
thermostability of proteins using facile or conventional expression systems. 

Definitions 

As used herein, "about" or "approximately" shall mean within 20 percent, preferably
within 10 percent, and more preferably within 5 percent of a given value or range. 
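
The three tolerance levels in this definition can be illustrated with a small Python check. This is a sketch only: it assumes the tolerance is applied as a fraction of a single stated value (a stated range is not handled), and the helper name is hypothetical.

```python
def is_about(observed: float, stated: float, tolerance: float = 0.20) -> bool:
    """True if `observed` lies within the fractional `tolerance` of `stated`."""
    return abs(observed - stated) <= tolerance * abs(stated)


# Illustrative comparison of an observed value of 60 against a stated value of 65
# at the 20, 10 and 5 percent levels named in the definition.
for tol in (0.20, 0.10, 0.05):
    print(f"within {tol:.0%}: {is_about(60.0, 65.0, tol)}")
```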

The term "substrate" means any substance or compound that is converted or meant to 
be converted into another compound by the action of an enzyme catalyst. The term includes 
aromatic and aliphatic compounds, and includes not only a single compound, but also 
combinations of compounds, such as solutions, mixtures and other materials which contain at 
least one substrate. 

An "oxidation reaction" or "oxygenation reaction", as used herein, is a chemical or 
biochemical reaction involving the addition of oxygen to a substrate, to form an oxygenated or 
oxidized substrate or product. An oxidation reaction is typically accompanied by a reduction 
reaction (hence the term "redox" reaction, for oxidation and reduction). A compound is 
"oxidized" when it receives oxygen or loses electrons. A compound is "reduced" when it loses 
oxygen or gains electrons. GAO typically catalyzes the oxidation of a primary alcohol group
to an aldehyde. 

The term "enzyme" means any substance composed wholly or largely of protein or 
polypeptides that catalyzes or promotes, more or less specifically, one or more chemical or 
biochemical reactions. 

A "polypeptide" (one or more peptides) is a chain of chemical building blocks called 
amino acids that are linked together by chemical bonds called peptide bonds. A protein or 
polypeptide, including an enzyme, may be "native" or "wild-type", meaning that it occurs in 
nature or has the amino acid sequence of a native protein, respectively. These terms are 
sometimes used interchangeably. A polypeptide may or may not be glycosylated. A 
"recombinant wild-type" typically means the wild type sequence in a recombinant host without 
glycosylation. Comparisons in the examples and figures of this application are generally with
reference to a wild type that is a recombinant wild type. A polypeptide may also be a "mutant",
"variant" or "modified", meaning that it has been made, altered, derived, or is in some way
different or changed from a native protein, or from another mutant. A native wild type protein
comprises the natural sequence of amino acids in the polypeptide and typically includes
glycosylation. A "parent" polypeptide or enzyme is any polypeptide or enzyme from which any
other polypeptide or enzyme is derived or made, using any methods, tools or techniques, and
whether or not the parent is itself a native or mutant polypeptide or enzyme. A parent
polynucleotide is one that encodes a parent polypeptide. A "test enzyme" is a protein-containing
substance that is tested to determine whether it has properties of an enzyme. The term "enzyme" 
can also refer to a catalytic polynucleotide (e.g. RNA or DNA). 

The "activity" of an enzyme is a measure of its ability to catalyze a reaction, and may be 
expressed as the rate at which the product of the reaction is produced. For example, enzyme 
activity can be represented as the amount of product produced per unit of time, per unit (e.g.,
concentration or weight) of enzyme. The "stability" of an enzyme means its ability to function, 
over time, in a particular environment or under particular conditions. One way to evaluate 
stability is to assess its ability to resist a loss of activity over time, under given conditions. 
Enzyme stability can also be evaluated in other ways, for example, by determining the relative 
degree to which the enzyme is in a folded or unfolded state. Thus, one enzyme is more stable 
than another, or has improved stability, when it is more resistant than the other enzyme to a loss 
of activity under the same conditions, is more resistant to unfolding, or is more durable by any 
suitable measure. For example, a more "thermally stable" or "thermostable" enzyme is one that 
is more resistant to loss of structure (unfolding) or function (enzyme activity) when exposed to 
heat or an elevated temperature. One way to evaluate this is to determine the "melting 
temperature" or Tm for the protein. The melting temperature, also called a midpoint, is the
temperature at which half of the protein is unfolded from its fully folded state. This midpoint is
typically determined by calculating the midpoint of a titration curve that plots protein unfolding
as a function of temperature. Thus, a protein with a higher Tm requires more heat to cause
unfolding and is more stable or more thermostable. Stated another way, a higher Tm
indicates that fewer molecules of that protein are unfolded at the same temperature than for a
protein with a lower Tm, again meaning that the protein which is more resistant to unfolding is
more stable (it has less unfolding at the same temperature). Another measure of stability is T1/2
(or T50), which is the transition midpoint of the inactivation curve of the protein as a function of
temperature. T1/2 is the temperature at which the protein loses half of its activity. Thus, a
protein with a higher T1/2 requires more heat to deactivate it, and is more stable or more
thermostable. Stated another way, a higher T1/2 indicates that fewer molecules of
that protein are inactive at the same temperature than for a protein with a lower T1/2, again meaning
that the protein which is more resistant to deactivation is more stable (it has more activity at the 
same temperature). These assays are also called "thermal shift" assays, because the inactivation 
or unfolding curve, plotted against temperature, is "shifted" to higher or lower temperatures 
when stability increases or decreases. Thermostability can also be measured in other ways. For 
example, a longer half-life (t1/2) for the enzyme's activity at elevated temperature is an indication
of thermostability. 
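
As an illustration of the transition midpoint described above, the following Python sketch estimates the temperature at which residual activity falls to one half by linear interpolation between measured points. It rests on stated assumptions: the function name and the data are hypothetical and are not results from this application, and linear interpolation is a simplification (a fitted sigmoidal curve could be used instead).

```python
from typing import Sequence


def transition_midpoint(temps_c: Sequence[float],
                        residual: Sequence[float],
                        level: float = 0.5) -> float:
    """Temperature at which residual activity first falls to `level`,
    found by linear interpolation between neighbouring measurements."""
    points = list(zip(temps_c, residual))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 >= level >= a1 and a0 != a1:
            return t0 + (a0 - level) * (t1 - t0) / (a0 - a1)
    raise ValueError("inactivation curve does not cross the requested level")


# Hypothetical residual activities (fraction of an unheated control).
temps = [40.0, 50.0, 60.0, 70.0, 80.0]
parent = [1.00, 0.95, 0.60, 0.20, 0.02]
variant = [1.00, 0.98, 0.90, 0.55, 0.10]

t_parent = transition_midpoint(temps, parent)
t_variant = transition_midpoint(temps, variant)
print(f"parent midpoint:  {t_parent:.1f} C")
print(f"variant midpoint: {t_variant:.1f} C")
```

In this made-up data set the variant's inactivation curve is shifted to higher temperature, so its midpoint is higher and it would be described as more thermostable than the parent.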

An "oxidation enzyme" is an enzyme that catalyzes one or more oxidation reactions, 
typically by adding, inserting, contributing or transferring oxygen from a source or donor to a 
substrate. Such enzymes are also called oxidoreductases or redox enzymes, and encompass
oxygenases, hydrogenases or reductases, oxidases and peroxidases. 

The terms "oxygen donor", "oxidizing agent" and "oxidant" mean a substance, molecule 
or compound which donates oxygen to a substrate in an oxidation reaction. Typically, the 
oxygen donor is reduced (accepts electrons). Exemplary oxygen donors, which are not limiting, 
include molecular oxygen or dioxygen (O2) and peroxides, including alkyl peroxides such as t- 
butyl peroxide, and most preferably hydrogen peroxide (H2O2). A peroxide is any compound 
having two oxygen atoms bound to each other. 

A "luminescent" substance means any substance which produces detectable 
electromagnetic radiation, or a change in electromagnetic radiation, most notably visible light, 
by any mechanism, including color change, UV absorbance, fluorescence and phosphorescence. 
Preferably, a luminescent substance according to the invention produces a detectable color, 
fluorescence or UV absorbance. The term "chemiluminescent agent" means any luminescent 
substance which enhances the detectability of a luminescent (e.g., fluorescent) signal, for example
by increasing the strength or lifetime of the signal. One exemplary and preferred
chemiluminescent agent is azinobis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS). Others
include 5-amino-2,3-dihydro-1,4-phthalazinedione (luminol) and analogs, 1,2-dioxetanes such as
tetramethyl-1,2-dioxetane (TMD), 1,2-dioxetanones, and 1,2-dioxetanediones, o-anisidine,
o-dianisidine, and o-tolidine. Another term for these kinds of materials is "chromogen."

The term "polymer" means any substance or compound that is composed of two or more
building blocks ('mers') that are repetitively linked to each other. For example, a "dimer" is a
compound in which two building blocks have been joined together.

The term "cofactor" means any non-protein substance that is necessary or beneficial to 
the activity of an enzyme. A "coenzyme" means a cofactor that interacts directly with and serves 
to promote a reaction catalyzed by an enzyme. Many coenzymes serve as carriers. For example, 
NAD+ and NADP+ carry hydrogen atoms from one enzyme to another. An "ancillary protein"
means any protein substance that is necessary or beneficial to the activity of an enzyme.

The term "host cell" means any cell of any organism that is selected, modified,
transformed, grown, or used or manipulated in any way, for the production of a substance by
the cell, for example the expression by the cell of a gene, a DNA or RNA sequence, a protein or
an enzyme. 

"DNA" (deoxyribonucleic acid) means any chain or sequence of the chemical building
blocks adenine (A), guanine (G), cytosine (C) and thymine (T), called nucleotide bases, that are
linked together on a deoxyribose sugar backbone. DNA can have one strand of nucleotide bases,
or two complementary strands which may form a double helix structure. "RNA" (ribonucleic
acid) means any chain or sequence of the chemical building blocks adenine (A), guanine (G),
cytosine (C) and uracil (U), called nucleotide bases, that are linked together on a ribose sugar
backbone. RNA typically has one strand of nucleotide bases.
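
The relationship between complementary DNA strands, and between the DNA and RNA alphabets, can be sketched in a few lines of Python. The function names and the example sequence are illustrative assumptions, not sequences from this application.

```python
# Watson-Crick pairing for DNA: A pairs with T, and G pairs with C.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Sequence of the complementary strand, written in its own 5'->3' direction."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def to_rna(dna: str) -> str:
    """RNA with the same base order as the given DNA strand (uracil replaces thymine)."""
    return dna.upper().replace("T", "U")


strand = "ATGAAAGCT"
print(reverse_complement(strand))
print(to_rna(strand))
```

Applying `reverse_complement` twice returns the original strand, which is one simple way to check the pairing table.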

A "polynucleotide" or "nucleotide sequence" is a series of nucleotide bases (also called
"nucleotides") in DNA and RNA, and means any chain of two or more nucleotides. A nucleotide
sequence typically carries genetic information, including the information used by cellular
machinery to make proteins and enzymes. These terms include double or single stranded
genomic and cDNA, RNA, any synthetic and genetically manipulated polynucleotide, and both
sense and anti-sense polynucleotide (although only sense strands are being represented herein).
This includes single- and double-stranded molecules, i.e., DNA-DNA, DNA-RNA and RNA-RNA
hybrids, as well as "protein nucleic acids" (PNA) formed by conjugating bases to an amino
acid backbone. This also includes nucleic acids containing modified bases, for example
thio-uracil, thio-guanine and fluoro-uracil.

The polynucleotides herein may be flanked by natural regulatory sequences, or may be
associated with heterologous sequences, including promoters, enhancers, response elements,
signal sequences, polyadenylation sequences, introns, 5'- and 3'- non-coding regions, and the like.
The nucleic acids may also be modified by many means known in the art. Non-limiting examples
of such modifications include methylation, "caps", substitution of one or more of the naturally
occurring nucleotides with an analog, and internucleotide modifications such as, for example,
those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoroamidates,
carbamates, etc.) and with charged linkages (e.g., phosphorothioates, phosphorodithioates, etc.).
Polynucleotides may contain one or more additional covalently linked moieties, such as, for
example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine, etc.),
intercalators (e.g., acridine, psoralen, etc.), chelators (e.g., metals, radioactive metals, iron,
oxidative metals, etc.), and alkylators. The polynucleotides may be derivatized by formation of
a methyl or ethyl phosphotriester or an alkyl phosphoramidate linkage. Furthermore, the
polynucleotides herein may also be modified with a label capable of providing a detectable signal,
either directly or indirectly. Exemplary labels include radioisotopes, fluorescent molecules,
biotin, and the like.

Proteins and enzymes are made in the host cell using instructions in DNA and RNA, 
according to the genetic code. Generally, a DNA sequence having instructions for a particular 
protein or enzyme is "transcribed" into a corresponding sequence of RNA. The RNA sequence 
in turn is "translated" into the sequence of amino acids which form the protein or enzyme. An 
"amino acid sequence" is any chain of two or more amino acids. Each amino acid is represented 
in DNA or RNA by one or more triplets of nucleotides. Each triplet forms a codon, 
corresponding to an amino acid. For example, the amino acid lysine (Lys) can be coded by the 
nucleotide triplet or codon AAA or by the codon AAG. (The genetic code has some 
redundancy, also called degeneracy, meaning that most amino acids have more than one 
corresponding codon.) Because the nucleotides in DNA and RNA sequences are read in groups 
of three for protein production, it is important to begin reading the sequence at the correct 
nucleotide, so that the correct triplets are read. The way that a nucleotide sequence is grouped into 
codons is called the "reading frame." 
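
As an illustration of codons and reading frames, the following minimal Python sketch groups a nucleotide sequence into triplets from a chosen offset and translates them. The codon table is deliberately partial, and the function names and the example sequence are hypothetical; they are not part of the disclosure.

```python
# Partial codon table for illustration only (entries from the standard genetic code).
CODON_TABLE = {
    "ATG": "Met", "AAA": "Lys", "AAG": "Lys",
    "GGA": "Gly", "GAA": "Glu", "TGG": "Trp", "TAA": "Stop",
}

def split_codons(seq, frame=0):
    """Group a DNA sequence into complete triplets starting at offset `frame`."""
    usable = seq[frame:]
    end = len(usable) - len(usable) % 3  # drop any trailing partial triplet
    return [usable[i:i + 3] for i in range(0, end, 3)]

def translate(seq, frame=0):
    """Translate codons until a stop codon; codons missing from the table become 'Xaa'."""
    residues = []
    for codon in split_codons(seq, frame):
        amino_acid = CODON_TABLE.get(codon, "Xaa")
        if amino_acid == "Stop":
            break
        residues.append(amino_acid)
    return residues

example = "ATGAAAAAGTGGTAA"            # hypothetical sequence
in_frame = translate(example)           # read from the first nucleotide
shifted = split_codons(example, 1)      # same nucleotides, different reading frame
```

Reading the same sequence one nucleotide later yields entirely different triplets, which is why the reading frame must be fixed before a sequence can be interpreted. Both lysine codons named above (AAA and AAG) map to the same residue in this table.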

The term "gene", also called a "structural gene", means a DNA sequence that codes for 
or corresponds to a particular sequence of amino acids which comprise all or part of one or more 
proteins or enzymes, and may or may not include regulatory DNA sequences, such as promoter 
sequences, which determine for example the conditions under which the gene is expressed. 
Some genes, which are not structural genes, may be transcribed from DNA to RNA, but are not 
translated into an amino acid sequence. Other genes may function as regulators of structural 
genes or as regulators of DNA transcription. 

A "coding sequence" or a sequence "encoding" a polypeptide, protein or enzyme is a 
nucleotide sequence that, when expressed, results in the production of that polypeptide, protein 
or enzyme, i.e., the nucleotide sequence encodes an amino acid sequence for that polypeptide, 
protein or enzyme. A coding sequence is "under the control" of transcriptional and translational 
control sequences in a cell when RNA polymerase transcribes the coding sequence into mRNA, 
which is then trans-RNA spliced and translated into the protein encoded by the coding sequence. 
Preferably, the coding sequence is a double-stranded DNA sequence which is transcribed and 
translated into a polypeptide in a cell in vitro or in vivo when placed under the control of 
appropriate regulatory sequences. The boundaries of the coding sequence are determined by a 
start codon at the 5' (amino) terminus and a translation stop codon at the 3' (carboxyl) terminus. 
A coding sequence can include, but is not limited to, prokaryotic sequences, cDNA from 
eukaryotic mRNA, genomic DNA sequences from eukaryotic (e.g., mammalian) DNA, and even 
synthetic DNA sequences. If the coding sequence is intended for expression in a eukaryotic cell, 
a polyadenylation signal and transcription termination sequence will usually be located 3' to the 
coding sequence. 

Transcriptional and translational control sequences are DNA regulatory sequences, such 
as promoters, enhancers, terminators, and the like, that provide for the expression of a coding 
sequence in a host cell. In eukaryotic cells, polyadenylation signals are control sequences. 

A "promoter sequence" is a DNA regulatory region capable of binding RNA polymerase 
in a cell and initiating transcription of a downstream (3' direction) coding sequence. For 
purposes of defining this invention, the promoter sequence is bounded at its 3' terminus by the 
transcription initiation site and extends upstream (5' direction) to include the minimum number 
of bases or elements necessary to initiate transcription at levels detectable above background. 
Within the promoter sequence will be found a transcription initiation site (conveniently defined, 
for example, by mapping with nuclease S1), as well as protein binding domains (consensus 
sequences) responsible for the binding of RNA polymerase. As described above, promoter DNA 
is a DNA sequence which initiates, regulates, or otherwise mediates or controls the expression 
of the coding DNA. A promoter may be "inducible", meaning that it is influenced by the 
presence or amount of another compound (an "inducer"). For example, an inducible promoter 
includes those which initiate or increase the expression of a downstream coding sequence in the 
presence of a particular inducer compound. A "leaky" inducible promoter is a promoter that 
provides a high expression level in the presence of an inducer compound and a comparatively 
very low expression level, and at minimum a detectable expression level, in the absence of the 
inducer. 

A "signal sequence" is included at the beginning of the coding sequence of a protein to 
be expressed in the periplasmic space, or outside the cell. This sequence encodes a signal 
peptide, N-terminal to the mature polypeptide, that directs the host cell to translocate the 
polypeptide. The term "translocation signal sequence" is also used to refer to a signal sequence. 
Translocation signal sequences can be found associated with a variety of proteins native to 
eukaryotes and prokaryotes, and are often functional in both types of organisms. Proteins of the 
invention may be further modified and improved by adding a sequence which directs the 
secretion of the protein outside the host cell. The addition of the signal sequence does not 
interfere with the folding of the secreted protein, and evidence thereof is easily tested for using 
techniques known in the art and depending on the protein (e.g., tests for activity of a given 
protein after modification). 

The terms "express" and "expression" mean allowing or causing the information in a gene 
or DNA sequence to become manifest, for example producing a protein by activating the cellular 
functions involved in transcription and translation of a corresponding gene or DNA sequence. 
A DNA sequence is expressed in or by a cell to form an "expression product" such as a protein. 
The expression product itself, e.g. the resulting protein, may also be said to be "expressed" by 
the cell. A polynucleotide or polypeptide is expressed recombinantly, for example, when it is 
expressed or produced in a foreign host cell under the control of a foreign or native promoter, 
or in a native host cell under the control of a foreign promoter. 

A polynucleotide or polypeptide is "over-expressed" when it is expressed or produced 
in an amount or yield that is substantially higher than a given base-line yield, e.g. a yield that 
occurs in nature. For example, a polypeptide is over-expressed when the yield is substantially 
greater than the normal, average or base-line yield of the native polypeptide in native host 
cells under given conditions, for example conditions suitable to the life cycle of the native host 
cells. Over-expression of a polypeptide can be obtained, for example, by altering any one or 
more of (a) the growth or living conditions of the host cells; (b) the polynucleotide encoding the 
polypeptide to be over-expressed; (c) the promoter used to control expression of the 
polynucleotide; and (d) the host cells themselves. This is a relative term, and thus "over-expression" 
can also be used to compare or distinguish the expression level of one polypeptide to another, 
without regard for whether either polypeptide is a native polypeptide or is encoded by a native 
polynucleotide. Typically, over-expression means a yield that is at least about two times a 
normal, average or given base-line yield. Thus, a polypeptide is over-expressed when it is 
produced in an amount or yield that is substantially higher than the amount or yield of a parent 
polypeptide or under parent conditions. Likewise, a polypeptide is "under-expressed" when it 
is produced in an amount or yield that is substantially lower than the amount or yield of a parent 
polypeptide or under parent conditions, e.g. at most half the base-line yield. In this context, the 
expression level or yield refers to the amount or concentration of polynucleotide that is 
expressed, or polypeptide that is produced (i.e. expression product), whether or not in an active 
or functional form. As one example, a polynucleotide or polypeptide may be said to be 
under-expressed when it is expressed in detectable amounts under the control of an inducible promoter, 
but without induction, i.e. in the absence of an inducer compound. 
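
The relative nature of these terms can be expressed as a simple fold-change test against a base-line yield. The sketch below is illustrative only: the function name is hypothetical, the two-fold threshold follows the "at least about two times" language above, and the half-fold threshold assumes that "half the base-line yield" is intended as an upper bound for under-expression.

```python
def classify_expression(yield_value, baseline):
    """Classify a yield relative to a base-line yield measured in the same units."""
    if baseline <= 0:
        raise ValueError("baseline yield must be positive")
    fold = yield_value / baseline
    if fold >= 2.0:      # about two times the base-line or more
        return "over-expressed"
    if fold <= 0.5:      # assumed reading: half the base-line or less
        return "under-expressed"
    return "comparable"
```

For example, a yield of 10 units against a base-line of 4 units is a 2.5-fold increase and would be classified as over-expressed under these assumed thresholds.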

An expression product can be characterized as intracellular, extracellular or secreted. The 
term "intracellular" means something that is inside a cell. The term "extracellular" means 
something that is outside a cell. A substance is "secreted" by a cell if it is delivered to the 
periplasm or outside the cell, from somewhere on or inside the cell. 

As used herein, the terms "expression-resistant polypeptide" and "resistant to functional 
expression" are synonymous and refer to a polypeptide that is difficult to functionally express 
in selected host cells. For example, an expression-resistant polypeptide is not produced, or is 
produced in very low yield or in non-functional form, when a polynucleotide encoding that 
polypeptide is transformed or introduced into host cells, e.g. into a facile host cell expression 
system. 

The term "transformation" means the introduction of a "foreign" (i.e. extrinsic or 
extracellular) gene, DNA or RNA sequence to a host cell, so that the host cell will express the 
introduced gene or sequence to produce a desired substance, typically a protein or enzyme coded 
by the introduced gene or sequence. The introduced gene or sequence may also be called a 
"cloned" or "foreign" gene or sequence, and may include regulatory or control sequences, such as 
start, stop, promoter, signal, secretion, or other sequences used by a cell's genetic machinery. 
The gene or sequence may include nonfunctional sequences or sequences with no known 
function. A host cell that receives and expresses introduced DNA or RNA has been 
"transformed" and is a "transformant" or a "clone." The DNA or RNA introduced to a host cell 
can come from any source, including cells of the same genus or species as the host cell, or cells 
of a different genus or species. 

The terms "vector", "cloning vector" and "expression vector" mean the vehicle by which 
a DNA or RNA sequence (e.g. a foreign gene) can be introduced into a host cell, so as to 
transform the host and promote expression (e.g. transcription and translation) of the introduced 
sequence. 

Vectors typically comprise the DNA of a transmissible agent, into which foreign DNA 
is inserted. A common way to insert one segment of DNA into another segment of DNA 
involves the use of enzymes called restriction enzymes that cleave DNA at specific sites (specific 
groups of nucleotides) called restriction sites. Generally, foreign DNA is inserted at one or more 
restriction sites of the vector DNA, and then is carried by the vector into a host cell along with 
the transmissible vector DNA. A segment or sequence of DNA having inserted or added DNA, 
such as an expression vector, can also be called a "DNA construct." 
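
The insertion of foreign DNA at a restriction site can be pictured with a simplified, single-strand string model. The sketch below is a hypothetical illustration, not a cloning protocol: it uses the EcoRI recognition sequence GAATTC (cut between G and AATTC) as an example enzyme, ignores the complementary strand and ligation chemistry, and the vector and payload sequences are invented.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; the enzyme cuts after the first G

def insert_at_site(vector, fragment, site=ECORI_SITE, cut_offset=1):
    """Insert `fragment` at the first occurrence of `site` in `vector` (top strand only)."""
    position = vector.find(site)
    if position < 0:
        raise ValueError("restriction site not found in vector")
    cut = position + cut_offset
    return vector[:cut] + fragment + vector[cut:]

vector = "TTTGAATTCAAA"                 # hypothetical vector with one site
payload = "ATGAAATAA"                   # hypothetical foreign DNA
fragment = "AATTC" + payload + "G"      # payload as excised by the same enzyme
construct = insert_at_site(vector, fragment)
```

In this model the resulting DNA construct carries the payload flanked by two regenerated recognition sites, which mirrors why an insert cut with the same enzyme can later be excised again.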

A common type of vector is a "plasmid", which generally is a self-contained molecule 
of double-stranded DNA that can readily accept additional (foreign) DNA and which can readily 
be introduced into a suitable host cell. A plasmid vector often contains coding DNA and promoter 
DNA and has one or more restriction sites suitable for inserting foreign DNA. Promoter DNA 
and coding DNA may be from the same gene or from different genes, and may be from the same 
or different organisms. A large number of vectors, including plasmid and fungal vectors, have 
been described for replication and/or expression in a variety of eukaryotic and prokaryotic hosts. 
Non-limiting examples include pKK plasmids (Clontech), pUC plasmids, pET plasmids 
(Novagen, Inc., Madison, WI), pRSET or pREP plasmids (Invitrogen, San Diego, CA), or 
pMAL plasmids (New England Biolabs, Beverly, MA), and many appropriate host cells, using 
methods disclosed or cited herein or otherwise known to those skilled in the relevant art. 
Recombinant cloning vectors will often include one or more replication systems for cloning or 
expression, one or more markers for selection in the host, e.g. antibiotic resistance, and one or 
more expression cassettes. Routine experimentation in biotechnology can be used to determine 
which vectors are best suited for use with the invention. In general, the choice of vector 
depends on the size of the polynucleotide sequence and the host cell to be employed in the 
methods of this invention. 

A "cassette" refers to a segment of DNA that can be inserted into a vector at specific 
restriction sites. The segment of DNA encodes a polypeptide of interest, and the cassette and 
restriction sites are designed to ensure insertion of the cassette in the proper reading frame for 
transcription and translation. 



The term "expression system" means a host cell and compatible vector under suitable 
conditions, e.g. for the expression of a protein coded for by foreign DNA carried by the vector 
and introduced to the host cell. Common expression systems include bacteria (e.g. E. coli and 
B. subtilis) or yeast (e.g. S. cerevisiae) host cells and plasmid vectors, and insect host cells and 
Baculovirus vectors. As used herein, a "facile expression system" means any expression system 
that is foreign or heterologous to a selected polynucleotide or polypeptide, and which employs 
host cells that can be grown or maintained more advantageously than cells that are native or 
heterologous to the selected polynucleotide or polypeptide, or which can produce the 
polypeptide more efficiently or in higher yield. For example, the use of robust prokaryotic cells 
to express a protein of eukaryotic origin would be a facile expression system. Preferred facile 
expression systems include E. coli, B. subtilis and S. cerevisiae host cells and any suitable vector. 

The terms "mutant" and "mutation" mean any detectable change in genetic material, e.g. 
DNA, or any process, mechanism, or result of such a change. This includes gene mutations, in 
which the structure (e.g. DNA sequence) of a gene is altered, any gene or DNA arising from any 
mutation process, and any expression product (e.g. protein or enzyme) expressed by a modified 
gene or DNA sequence. The term "variant" may also be used to indicate a modified or altered 
gene, DNA sequence, enzyme, cell, etc., i.e., any kind of mutant. Such changes also include 
changes in the promoter, ribosome binding site, etc. 

"Sequence-conservative variants" of a polynucleotide sequence are those in which a 
change of one or more nucleotides in a given codon position results in no alteration in the amino 
acid encoded at that position. 
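
A sequence-conservative change can be checked codon by codon. The following Python sketch is a hypothetical illustration using a small excerpt of the standard genetic code; the function name and test sequences are assumptions made for the example.

```python
# Small excerpt of the standard genetic code, for illustration only.
CODONS = {"AAA": "Lys", "AAG": "Lys", "GAA": "Glu", "GAG": "Glu", "AGA": "Arg"}

def is_sequence_conservative(parent, variant, table=CODONS):
    """Return True if two in-frame sequences of equal length encode the same residues."""
    if len(parent) != len(variant) or len(parent) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    for i in range(0, len(parent), 3):
        # Every codon must be present in the (partial) table used here.
        if table[parent[i:i + 3]] != table[variant[i:i + 3]]:
            return False
    return True
```

For instance, changing AAA to AAG leaves lysine unchanged and is sequence-conservative, whereas changing AAA to AGA replaces lysine with arginine and is not.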

"Function-conservative variants" are those in which a given amino acid residue in a 
protein or enzyme has been changed without altering the overall conformation and function of 
the polypeptide, including, but not limited to, replacement of an amino acid with one having 
similar properties (such as, for example, acidic, basic, hydrophobic, and the like). Amino acids 
with similar properties are well known in the art. For example, arginine, histidine and lysine are 
hydrophilic-basic amino acids and may be interchangeable. Similarly, isoleucine, a hydrophobic 
amino acid, may be replaced with leucine, methionine or valine. Amino acids other than those 
indicated as conserved may differ in a protein or enzyme so that the percent protein or amino 
acid sequence similarity between any two proteins of similar function may vary and may be, for 
example, from 70% to 99% as determined according to an alignment scheme such as by the 
Cluster Method, wherein similarity is based on the MEGALIGN algorithm. A 
"function-conservative variant" also includes a polypeptide or enzyme which has at least 60% amino acid 
identity as determined by BLAST or FASTA algorithms, preferably at least 75%, more preferably 
at least 85%, and most preferably at least 90%, and which has the same or substantially 
similar properties or functions as the native or parent protein or enzyme to which it is compared. 
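
To make the percent-identity thresholds concrete, the sketch below computes identity over two sequences that have already been aligned. It is a simplified stand-in and does not reproduce the BLAST, FASTA or MEGALIGN calculations; the function name, the gap convention ('-') and the example peptides are assumptions, and identity definitions differ between tools.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned, gap-free columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")  # one substitution in ten residues
```

Here a single isoleucine-to-leucine replacement in a ten-residue peptide gives 90% identity, which would meet each of the identity thresholds recited above.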

The term "DNA reassembly" is used when recombination occurs between identical 
sequences. "DNA shuffling" refers to a group of in vitro or in vivo methods involving 
recombination of nucleic acid species. For example, homologous recombination of pools of 
nucleic acid fragments or polynucleotides can be employed to generate polynucleotide molecules 
having variant sequences of the invention. 

"Isolation" or "purification" of a polypeptide or enzyme refers to the derivation of the 
polypeptide by removing it from its original environment (for example, from its natural 
environment if it is naturally occurring, or from the host cell if it is produced by recombinant 
DNA methods). Methods for polypeptide purification are well-known in the art, including, 
without limitation, preparative disc-gel electrophoresis, isoelectric focusing, HPLC, 
reversed-phase HPLC, gel filtration, ion exchange and partition chromatography, and countercurrent 
distribution. For some purposes, it is preferable to produce the polypeptide in a recombinant 
system in which the protein contains an additional sequence tag that facilitates purification, such 
as, but not limited to, a polyhistidine sequence. The polypeptide can then be purified from a 
crude lysate of the host cell by chromatography on an appropriate solid-phase matrix. 
Alternatively, antibodies produced against the protein or against peptides derived therefrom can 
be used as purification reagents. Other purification methods are possible. A purified 
polynucleotide or polypeptide may contain less than about 50%, preferably less than about 75%, 
and most preferably less than about 90%, of the cellular components with which it was originally 
associated. A "substantially pure" enzyme indicates the highest degree of purity which can be 
achieved using conventional purification techniques known in the art. 

Polynucleotides are "hybridizable" to each other when at least one strand of one 
polynucleotide can anneal to another polynucleotide under defined stringency conditions. 
Stringency of hybridization is determined, e.g., by a) the temperature at which hybridization 
and/or washing is performed, and b) the ionic strength and polarity (e.g., formamide) of the 
hybridization and washing solutions, as well as other parameters. Hybridization requires that the 
two polynucleotides contain substantially complementary sequences; depending on the stringency 
of hybridization, however, mismatches may be tolerated. Typically, hybridization of two 
sequences at high stringency (such as, for example, in an aqueous solution of 0.5X SSC at 65°C) 
requires that the sequences exhibit a high degree of complementarity over their entire 
sequence. Conditions of intermediate stringency (such as, for example, an aqueous solution of 
2X SSC at 65°C) and low stringency (such as, for example, an aqueous solution of 2X SSC at 
55°C) require correspondingly less overall complementarity between the hybridizing sequences. 
(1X SSC is 0.15 M NaCl, 0.015 M Na citrate.) Polynucleotides that "hybridize" to the 
polynucleotides herein may be of any length. In one embodiment, such polynucleotides are at 
least 10, preferably at least 15 and most preferably at least 20 nucleotides long. In another 
embodiment, polynucleotides that hybridize are of about the same length. In another 
embodiment, polynucleotides that hybridize include those which anneal under suitable stringency 
conditions and which encode polypeptides or enzymes having the same function, such as the 
ability to catalyze an oxidation, oxygenase, or coupling reaction of the invention. 
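
The example stringency conditions above can be converted into salt concentrations by scaling the stated 1X SSC composition. The following sketch is illustrative arithmetic only; the dictionary and function names are hypothetical.

```python
# 1X SSC composition in mol/L, as stated in the text.
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}

# Example conditions from the text: (SSC strength, temperature in degrees C).
CONDITIONS = {
    "high": (0.5, 65),
    "intermediate": (2.0, 65),
    "low": (2.0, 55),
}

def ssc_molarity(strength):
    """Scale the 1X SSC composition to the given strength (e.g. 0.5 for 0.5X)."""
    return {salt: strength * molar for salt, molar in SSC_1X.items()}

def describe(level):
    """Return the temperature and salt molarities for a named stringency level."""
    strength, temperature_c = CONDITIONS[level]
    return {"temperature_C": temperature_c, **ssc_molarity(strength)}
```

Under these figures the high-stringency example (0.5X SSC) corresponds to 0.075 M NaCl and 0.0075 M Na citrate, a quarter of the salt present in the 2X SSC examples.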

The general genetic engineering tools and techniques discussed here, including 
transformation and expression, the use of host cells, vectors, expression systems, etc., are well 
known in the art. 



Mutagenesis and Directed Evolution of Proteins 

To improve the expression and function of proteins using conventional expression 
systems, the invention makes the unexpected discovery that directed evolution can be used to 
generate mutant libraries of polynucleotides which, when expressed using conventional or facile 
expression systems, result in functional proteins having increased activity and/or thermostability. 

According to the invention, proteins that are expressed in facile gene expression systems 
can be obtained by using directed evolution to generate mutant polynucleotides in a library 
format for selection. General methods for generating libraries and isolating and identifying 
improved proteins (also described as "variants") according to the invention using directed 
evolution are described briefly below and more extensively, for example, in U.S. Patent Nos. 
5,741,691 and 5,811,238. See also International Applications WO 98/42832, WO 95/22625, 
WO 97/20078, and WO 95/ and U.S. Patents 5,605,793 and 5,830,721 (143, 149-156). It 
should be understood that any method for generating mutations in polynucleotide sequences to 
provide an evolved polynucleotide for use in expression systems can be employed. Proteins 
produced by directed evolution methods can then be screened for improved expression, activity, 
thermostability, folding, secretion, and other functions and properties according to conventional 
methods. 

Any source of nucleic acid in purified form can be utilized as the starting nucleic acid. 
Thus the process may employ DNA or RNA including messenger RNA, which DNA or RNA 
may be single or double stranded. In addition, a DNA-RNA hybrid which contains one strand 
of each may be utilized. The nucleic acid sequence may be of various lengths depending on the 
size of the nucleic acid sequence to be mutated. Preferably the specific nucleic acid sequence is 
from 50 to 50,000 base pairs. It is contemplated that entire vectors containing the nucleic acid 
encoding the protein of interest may be used in the methods of this invention. 

Any specific nucleic acid sequence can be used to produce the population of mutants by 
the present process. An initial population of the specific nucleic acid sequences having mutations 
may be created by a number of different known methods, some of which are set forth below. 

Error-prone polymerase chain reaction (20,45,46) and cassette mutagenesis (38-44), in 
which the specific region optimized is replaced with a synthetically mutagenized oligonucleotide, 
can be employed in the invention. Error-prone PCR can be employed under low-fidelity 
polymerization conditions to introduce a low level of point mutations randomly over a long 
sequence, or to mutagenize a mixture of fragments of unknown sequence. 

Oligonucleotide-directed mutagenesis, which replaces a short sequence with a 
synthetically mutagenized oligonucleotide, may also be employed to generate evolved 
polynucleotides having improved expression. 
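
The effect of introducing a low level of random point mutations over a long sequence, as in error-prone PCR, can be simulated in silico. The sketch below is a hypothetical model, not a laboratory protocol: it substitutes each base independently with a fixed probability, and the function names, mutation rate and parent sequence are assumptions.

```python
import random

BASES = "ACGT"

def point_mutate(seq, rate, rng):
    """Replace each base by a different base with probability `rate`."""
    mutated = []
    for base in seq:
        if rng.random() < rate:
            mutated.append(rng.choice([b for b in BASES if b != base]))
        else:
            mutated.append(base)
    return "".join(mutated)

def make_library(parent, size, rate, seed=0):
    """Generate `size` variants of `parent`; the seed makes the run repeatable."""
    rng = random.Random(seed)
    return [point_mutate(parent, rate, rng) for _ in range(size)]

def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

parent = "ATGC" * 25                                  # hypothetical 100-nt parent
library = make_library(parent, size=10, rate=0.02)    # assumed rate of 2% per base
```

Each library member has the same length as the parent and differs from it at a small, random number of positions, which is the kind of mutant pool that is subsequently cloned and screened.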

Alternatively, nucleic acid or DNA shuffling, which uses a method of in vitro or in vivo, 
generally homologous, recombination of pools of nucleic acid fragments or polynucleotides, can 
be employed to generate polynucleotide molecules having variant sequences of the invention. 

Parallel PCR is another method that can be used to evolve polynucleotides for improved 
expression, function or properties in conventional expression systems, which uses a large number 
of different PCR reactions that occur in parallel in the same vessel, such that the product of one 
reaction primes the product of another reaction. Sequences can be randomly mutagenized at 
various levels by random fragmentation and reassembly of the fragments by mutual priming. 
Site-specific mutations can be introduced into long sequences by random fragmentation of the 
template followed by reassembly of the fragments in the presence of mutagenic oligonucleotides. 

A particularly useful application of parallel PCR, which can be used in the invention, is 
called sexual PCR. In sexual PCR, also known as DNA shuffling, parallel PCR is used to 
perform in vitro recombination on a pool of DNA sequences. Sexual PCR can also be used to 
construct libraries of chimaeras of genes from different species. 
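
In vitro recombination of a pool of related sequences can be approximated computationally by switching between aligned parents at random crossover points. The sketch below is a simplified, hypothetical model: it assumes equal-length, pre-aligned parents and does not simulate fragmentation, mutual priming or reassembly by PCR.

```python
import random

def shuffle_parents(parents, crossovers, rng):
    """Build one chimera by choosing a parent template between random crossover points."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("parents must be aligned and of equal length")
    points = sorted(rng.sample(range(1, length), crossovers))
    bounds = [0] + points + [length]
    segments = []
    for start, end in zip(bounds, bounds[1:]):
        template = rng.choice(parents)        # template switch at each boundary
        segments.append(template[start:end])
    return "".join(segments)

parents = ["AAAAAAAAAA", "CCCCCCCCCC"]        # hypothetical, easily distinguished parents
chimera = shuffle_parents(parents, crossovers=3, rng=random.Random(1))
```

Every position of the chimera is inherited from one of the parents at the same position, so beneficial mutations from different parents can be combined in a single molecule.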

The polynucleotide sequences for use in the invention can also be altered by chemical 
mutagenesis. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, 
hydroxylamine, hydrazine or formic acid. Other agents which are analogues of nucleotide 
precursors include nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine. Generally, these 
agents are added to the PCR reaction in place of the nucleotide precursor, thereby mutating the 
sequence. Intercalating agents such as proflavine, acriflavine, quinacrine and the like can also 
be used. Random mutagenesis of the polynucleotide sequence can also be achieved by irradiation 
with X-rays or ultraviolet light, or by subjecting the polynucleotide to propagation in a host 
(such as E. coli) that is deficient in the normal DNA damage repair function. Generally, plasmid 
DNA or DNA fragments so mutagenized are introduced into E. coli and propagated as a pool 
or library of mutant plasmids. 

Alternatively, a mixed population of specific nucleic acids may be found in nature in that 
they may consist of different alleles of the same gene or the same gene from different related 
species (i.e., cognate genes). Alternatively, they may be related DNA sequences found within 
one species, for example, the oxidase class of genes. Once the mixed population of the specific 
nucleic acid sequences is generated, the polynucleotides can be used directly or inserted into an 
appropriate cloning vector, using techniques well-known in the art. 

Once the evolved polynucleotide molecules are generated they can be cloned into a 
suitable vector selected by the skilled artisan according to methods well known in the art. If a 
mixed population of the specific nucleic acid sequence is cloned into a vector it can be clonally 
amplified by inserting each vector into a host cell and allowing the host cell to amplify the vector. 
The mixed population may be tested to identify the desired recombinant nucleic acid fragment. 
The method of selection will depend on the DNA fragment desired. For example, in this 
invention a DNA fragment which encodes for a protein with improved properties can be 
determined by tests for functional activity and/or stability of the protein. Such tests are well 
known in the art. 

Using the methods of directed evolution, the invention provides a novel means for 
producing functional and soluble proteins with improved activity toward one or more substrates. 
The mutants can be expressed in conventional or facile expression systems such as E. coli. 
Conventional tests can be used to determine whether a protein of interest produced from an 
expression system has improved expression, folding and/or functional properties. For example, 
to determine whether a polynucleotide subjected to directed evolution and expressed in a foreign 
host cell produces a protein with improved activity, one skilled in the art can perform 
experiments designed to test the functional activity of the protein. Briefly, the evolved protein 
can be rapidly screened, and is readily isolated and purified from the expression system or media 
if secreted. It can then be subjected to assays designed to test functional activity of the particular 
protein in native form. Such experiments for various proteins are well known in the art and are 
discussed in the Examples below. 

In one embodiment, the invention contemplates the use of polynucleotides encoding 
variants of oxidase enzymes. The invention employs directed evolution to generate novel 
oxidase enzymes, such as GAO, which are expressed in host cells (e.g., E. coli) used in an 
expression system, and which exhibit increased functional activity and increased thermostability. 

The invention can also be applied to select or optimize an expression system, including 
selection of host cells, promoters, and signal sequences. Expression conditions can also be 
optimized according to the invention. 

Directed Evolution of Galactose Oxidase 

Galactose oxidase (EC 1.1.3.9) is an alcohol oxidase enzyme. It oxidizes the hydroxyl 
group of the sixth carbon of D-galactose. It also oxidizes many other kinds of sugars and 
alcohols (77, 108, 114, 115, 118-120). Although many fungi produce galactose oxidase, no 
bacterium has been reported to produce the enzyme (109). There are many reports about 
galactose oxidase from Fusarium ssp. NRRL2903, which is identical to Dactylium dendroides 
ATCC46032 (76-78, 84-86, 88, 95, 99, 108, 110-128). FIG. 1. The native enzyme is an 
extracellular monomeric enzyme with a molecular weight of 67,000. It has one copper (II) ion 
associated with its active site and related to its oxidation properties. FIG. 2. Structure and 
amino acid residues related to catalysis have been characterized and reported (76, 78, 84-86, 88, 
111-113, 116-119). 

Galactose oxidase is currently used mainly for assays of D-galactose and 
D-galactosamine. The enzyme oxidizes the hydroxyl group in the substrate to an aldehyde, which 
is reactive. Therefore, the enzyme is implicated for use in production of non-natural sugars and 
derivatives of sugars (118, 119, 95, 99, 128). Hyper-production of galactose oxidase would be 
useful for a wide variety of applications. The gene of the galactose oxidase has been cloned 
(110) and expressed in Escherichia coli (127). This recombinant galactose oxidase was 
produced as a fused protein with the N-terminal sequence of LacZ. However, the yield of the 
galactose oxidase by this recombinant E. coli was not satisfactory. 

According to the invention, galactose oxidase enzyme (GAO) has been produced in high 
activity and with improved properties by recombinant techniques in E. coli. 

The following Examples are understood to be exemplary only, and do not limit the scope 
of the invention or the appended claims. A person of ordinary skill in the art will appreciate that 
the invention can be practiced in many forms according to the claims and disclosures here. 

EXAMPLE 1 

Activity Assays for Galactose Oxidase Expressed in E. coli 

This Example describes assays used for evaluating galactose oxidase activity. Galactose 
oxidase generates equimolar amounts of hydrogen peroxide by oxidation of a substrate. 
Colorimetric detection of hydrogen peroxide was therefore used to assay galactose oxidase 
activity, employing the following reaction scheme: 

    R-CH2OH + O2  --(GAO)-->  R-CHO + H2O2 

    H2O2 + chromogen  --(HRP)-->  color change 

This system can be used to assay for oxidation of various substrates, with very high 
sensitivity. In the reaction scheme above, an alcohol group of a substrate R is oxidized to 
produce an aldehyde, and hydrogen peroxide (H2O2) is released. For example, D-galactose is 
converted to D-galactohexodialdose plus H2O2. The chromogen, in the presence of hydrogen 
peroxide and a peroxidase enzyme, e.g. horseradish peroxidase (HRP), produces a detectable color 
change, indicating that the reaction catalyzed by GAO has occurred. 

A. Test Tube Assay 

The activity of galactose oxidase produced in E. coli was investigated using fungal 
galactose oxidase (Sigma, partially purified) as a standard. For detecting hydrogen peroxide with 
peroxidase (Sigma, type I from horseradish), a chromogen was selected for the GAO assays (85). 

1. Materials 

Cells. E. coli DH5αMCR (Life Technologies) was used for gene manipulation. E. coli 
BL21(DE3) (Novagen) was used as a host strain for expression of the galactose oxidase gene. E. 
coli KY-14478 (SN0029, lacking catalase, Kyowa Hakko Kogyo Co., Ltd.) was also used for 
manipulation and expression of genes (157). Competent cells for electroporation were prepared 
(147). 

Cultivation Media. Luria-Bertani (LB) medium (10 g/l bacto tryptone, 5 g/l bacto yeast 
extract, 10 g/l NaCl, pH 7.5) was used mainly for cultivation of E. coli (19). LB plates 
contained 15 g/l agar in LB medium. Ampicillin (100 mg/l) was added to the medium when 
required. 

Buffers. Solutions of sodium phosphate, potassium phosphate and Tris-HCl at various 
pH values were tested as buffer solutions for the assay. 

Chromogens. Many aromatic compounds can be used as a chromogen for the assay. 
Four chromogens showed particularly strong color formation (green, orange, red and red, 
respectively): (a) 2,2'-azinobis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) (85); (b) 
o-anisidine; (c) o-dianisidine (127, 123, 121, 122); and (d) o-tolidine (114, 119). Their peaks of 
absorbance were 410 nm, 490 nm, 460 nm and 420 nm, respectively. 

2. Methods 

Cultivation. Three steps of cultivation were performed for production of galactose 
oxidase. Recombinant E. coli strains were cultivated on LB plates containing ampicillin at 30 °C 
for 18 hours. The cells were inoculated into LB containing ampicillin. After cultivation at 30 °C 
for 12 hours, the culture was transferred to a new test tube containing 3 ml LB supplemented with 
ampicillin. The inoculation rate was 0.5 % of the medium. Isopropyl beta-D-thiogalactopyranoside 
(IPTG) (1 mM) was added for induction after cultivation at 30 °C for 7 hours. Cultivation was 
continued at 30 °C for 6 hours. 

Permeabilization. Permeable cells were prepared by freezing (-20 °C) and thawing (4 °C), 
followed by treatment with 0.5 mg/l lysozyme (Sigma, from chicken egg white) for 30 minutes at 
37 °C. This pre-treatment for permeabilization was used for the assay in evaluation of recombinant 
galactose oxidase (Example 3). 

Activity assay. The extract was assayed for galactose oxidase activity. Copper (II) 
sulfate solution (0.4 mM) was added to the cell-free extract. The cell-free extract was diluted 
in the buffer solution. Peroxidase (Sigma, type I from horseradish) (10 units/ml) and 
azinobis(3-ethylbenzothiazoline-6-sulfonic acid) (ABTS) (2 g/l) were added to the reaction 
solution. The reaction solution was pre-incubated at 37 °C for 5 minutes. Substrate was added to 
the solution to a concentration of 100 mM. The increase of absorbance (410 nm or 405 nm) was 
measured at 37 °C for 1 minute. Fungal galactose oxidase (Sigma, partially purified) was used as 
a standard for estimation of the activity. 
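
Estimation against a standard can be sketched as follows. This is an illustration only, not the calculation used in these experiments: it assumes that the rate of absorbance increase is proportional to activity over the working range, and the function name and all readings are hypothetical.

```python
def estimate_activity(sample_rate, standard_rate, standard_units_per_ml,
                      dilution_factor=1.0):
    """Estimate GAO activity (units/ml) from rates of absorbance increase.

    sample_rate, standard_rate: absorbance increase per minute (e.g. at 405 nm)
    standard_units_per_ml: known activity of the fungal GAO standard
    dilution_factor: fold dilution of the extract before the assay
    Assumes rate is proportional to activity (an assumption of this sketch).
    """
    if standard_rate <= 0:
        raise ValueError("standard_rate must be positive")
    return sample_rate / standard_rate * standard_units_per_ml * dilution_factor


# Hypothetical readings: a 0.5 units/ml standard gives 0.20 A/min and a
# 10-fold diluted extract gives 0.12 A/min.
activity = estimate_activity(0.12, 0.20, 0.5, dilution_factor=10)
print(round(activity, 2))
```

With these made-up readings the undiluted extract would be estimated at about 3 units/ml.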

3. Results 

From these experiments, ABTS was selected as a preferred chromogen for these types 
of assays, since ABTS formed its color most strongly and sensitively. Moreover, the highest 
assay sensitivity and lowest background were achieved when using a 100 mM sodium phosphate 
buffer solution (pH 7.0) for the assay. 

The minimum detectable activity of galactose oxidase for this assay system was 0.05 units/ml. 
Galactose oxidase activity between 0.1 and 1 units/ml was measured quantitatively by 
photometer at 410 nm or 405 nm. 

Catalase produced by E. coli degrades hydrogen peroxide and may influence the assay. 
In practice, catalase was not observed to pose a problem, because the activity of the galactose 
oxidase was much higher than that of catalase. 

Provided below are additional galactose oxidase screening techniques and/or activity 
assays, having the following advantages: high specificity for galactose oxidase, high sensitivity, 
good reproducibility, quantitative measurements, simplicity, flexibility for many substrates, and 
low cost. One screening system utilizes microplates and the other utilizes membranes. Both 
systems use horseradish peroxidase (type I, Sigma) together with a chromogen (ABTS). 

B. Microplate Screening Method 

The following micro-plate assay has a high sensitivity. Moreover, the enzyme activity 
can be determined quantitatively. To increase throughput, the method can be automated, for 
example robotically. This method is particularly suitable as a second screen, after active clones 
are identified by a more rapid first screen, such as a membrane screen. In experiments using 
these procedures, the active cultures on the microplate had galactose oxidase activity as indicated 
by strong green color formation, where each positive well on the microplate was visible as a dark 
circle. GAO activity was screened in 96-well plates. 

Briefly, single colonies were picked from LB-ampicillin (LB-Ap) agar plates into deep-well 
plates and grown in LB-Ap. The master plates were duplicated into new deep-well plates 
containing LB-Ap with 1 mM IPTG. Following cultivation at 30 °C, CuSO4 was added and the cells 
were lysed with lysozyme and SDS. Cell extracts were reacted with galactose and allyl alcohol 
using the GAO-HRP coupled assay described above. 

1. Methods for Approach A 

Single colonies were picked from Luria-Bertani/100 µg/ml ampicillin (LB-Ap) agar plates 
into deep-well polypropylene plates (well depth: 2.4 cm; volume: 1 ml; from Becton Dickinson 
Labware) and cells were grown for 10 h at 30 °C and 270 rpm in 200 µl LB-Ap. The master 
plates were duplicated by transferring a 10 µl aliquot to a new deep-well plate containing 300 
µl LB-Ap and 1 mM isopropyl-beta-D-thiogalactopyranoside (IPTG) and grown for 12 h at 30 
°C and 250 rpm. The cultures were then centrifuged for 10 min at 5000 rpm and the cell pellet 
was resuspended in 300 µl 100 mM sodium phosphate (NaPi) buffer, pH 7.0, containing 0.4 mM 
CuSO4. Following addition of 0.5 mg/ml lysozyme (35 min at 37 °C) and 2.5% (w/v) SDS 
(overnight at 4 °C), the GAO activity was assayed using the GAO-horseradish peroxidase (HRP) 
coupled assay (85). Aliquots of the cell extracts were reacted with galactose (50 mM for 
generation A1 or 25 mM for generations A2 and A3) and allyl alcohol (0.5 M for all generations) 
at pH 7.0. The initial rate of H2O2 formation was followed by monitoring the HRP-catalyzed 
oxidation of 2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (ABTS) at 405 nm. To assay 
thermostability, the plates were heated at a given temperature for 10 min, cooled down on ice 
for 10 min, and allowed to reach room temperature for ca. 5 min before the activity toward 
galactose was measured. The thermostability index was determined from the ratio of the residual 
GAO activity to the initial activity. Mutants identified as thermostable were then grown in test 
tubes (3 ml cultures) and the residual activity after heating at various temperatures was measured 
at room temperature. 
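
The thermostability index defined above (residual activity divided by initial activity) can be sketched in a few lines. The helper names, the parent index and the plate readings below are hypothetical and serve only to illustrate the ratio; they are not data from these experiments.

```python
def thermostability_index(residual_activity, initial_activity):
    """Ratio of GAO activity after heating to activity before heating."""
    if initial_activity <= 0:
        raise ValueError("initial activity must be positive")
    return residual_activity / initial_activity


def more_stable_than_parent(clones, parent_index):
    """Return names of clones whose index exceeds the parent's, best first.

    clones maps a clone name to (initial_activity, residual_activity).
    """
    scored = {name: thermostability_index(res, ini)
              for name, (ini, res) in clones.items()}
    hits = [name for name, idx in scored.items() if idx > parent_index]
    return sorted(hits, key=lambda name: scored[name], reverse=True)


# Hypothetical plate readings (arbitrary rate units).
plate = {"A1": (0.80, 0.20), "B7": (0.60, 0.45), "C3": (0.50, 0.30)}
print(more_stable_than_parent(plate, parent_index=0.30))
```

Using a ratio rather than the residual activity alone keeps a clone from being scored as thermostable merely because it is expressed at a higher level.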

2. Methods for Approach B 

Single colonies were picked from LB-Ap agar plates into deep-well polypropylene plates 
(well depth: 4.4 cm; volume: 2.2 ml; from Qiagen) and cells were grown for 8 h at 30 °C and 
270 rpm in 500 µl LB-Ap. The master plates were duplicated by transferring a 10 µl aliquot to 
a new deep-well plate containing 500 µl LB-Ap with 1 mM IPTG and grown overnight at 30 °C and 
270 rpm. An aliquot of the culture was transferred to a microtiter plate. Following addition of 
0.5 mg/ml lysozyme (30 min at 37 °C) and 0.4% (w/v) SDS - 0.4 mM CuSO4 in 100 mM NaPi buffer, 
pH 7.0 (4 h at 4 °C), the GAO activity was assayed using the GAO-HRP coupled assay as described 
above. The galactose concentration used was 25 mM (generations B1 and B2) or 10 mM 
(generations B3 and B4). 

C. Membrane Screening Method 

Although the micro-plate screening system is highly sensitive and quantitative, it is 
desirable to provide a method that can contemporaneously assay many more, e.g. thousands more, 
clones in a sensitive, accurate, practical and efficient manner. Methods for detection of galactose 
oxidase activities directly from colonies on agar plates were examined, but were found to exhibit 
relatively low sensitivity, low reproducibility, and very slow color formation. Hence, to evaluate 
very large numbers of mutants, methods for detection of their activities directly from colonies on 
agar plates or from colonies transferred onto a membrane were examined. These methods were 
based on colorimetric detection using chromogen and peroxidase, as in the micro-plate screening 
system. 

A suitable screening method using membranes was developed, as is shown here in one 
optimized form. After transformants formed colonies on an LB-Ap plate (100 mg/l, at 30 °C for 
18-24 hours), these colonies were transferred to a membrane, i.e. they were adsorbed onto the 
membrane and lifted. For cultivation, the membrane was placed on a new LB-Ap plate (100 
mg/l) and was incubated at 30 °C until new colonies were formed on the membrane (6-12 hours). 
The membrane was then transferred to a new LB-Ap (100 mg/l) plate with 1 mM IPTG and kept at 
30 °C for 6 hours for induction. Then, the membrane was put on a filter paper at room 
temperature, containing lysozyme (0.5 mg/ml), D-galactose (100 mM), ABTS (2 mg/ml), 
peroxidase (10 units/ml) and CuSO4 (0.4 mM). In experiments using these procedures, colonies 
which had galactose oxidase activities showed as deep purple on the filter paper. This simple 
method has suitable sensitivity and can be used to evaluate several thousand colonies on one 
membrane at once. 

Several thousand colonies can be evaluated by the screening method with one membrane. 
This method can be used with an image analyzer, for quantitative determination of the activity of 
each colony. Although the sensitivity of this method is not as high as that of the others, the 
method is fast and is suitable for a first or initial screening, because many thousands or even 
millions of colonies can be contemporaneously or rapidly evaluated. 

In a preferred embodiment, galactose oxidase activities of colonies which were 
transferred onto a membrane were estimated directly. Colonies, which were formed on an 
LB-ampicillin plate at 30 °C for 24 hours, were transferred onto a membrane (Immobilon NC 
(HATF), surfactant-free, 0.45 µm, 82 mm, Millipore). The membrane was put on a new 
LB-ampicillin plate and was kept at 30 °C for 6-12 hours until colonies were re-formed. Then the 
membrane was transferred onto an LB-ampicillin plate containing 1 mM IPTG and was incubated 
for 6 hours at 30 °C. After the membrane was put on filter paper containing 0.5 mg/l lysozyme, 
100 mM substrate, 2 mg/ml ABTS, 10 units/ml peroxidase and 0.4 mM CuSO4 in 100 mM 
sodium phosphate buffer solution (pH 7.0), the membrane was kept at room temperature for one 
day, covered with a shield (ABTS is light sensitive). Active colonies showed deep purple color 
formation. 

D. Assay Reagents and Conditions 

Some of the assays herein use CuSO4 and/or SDS. 

Copper sulfate is used to provide copper (II) ion to activate the recombinant (mutant or 
variant) enzyme. The activity of partially purified galactose oxidase from D. dendroides (Sigma) 
was detected well by using peroxidase and ABTS as described; the addition of copper (II) ion 
and other cofactors was not needed. (The Sigma enzyme already includes copper ions.) 
However, experiments with cell-free extracts of recombinant GAO enzymes of the invention 
showed that almost no activity was detected in the absence of copper (II) ions. Thus, the 
presence of copper (II) ion is preferred, and without being bound by any theory, is believed to 
be essential, to activate recombinant GAO enzymes produced by E. coli as described herein. 
Treatment with copper ions at 4 °C is preferred. Copper ion can be provided as copper sulfate 
(CuSO4). Experiments showed that 0.1 mM CuSO4 is sufficient, whereas 10 mM CuSO4 slightly 
inhibited GAO activity. Experiments under assay conditions showed that the preferred 
concentration of CuSO4 for activating crude enzyme solution is 0.4 mM. The metal (II) ions of 
iron, cobalt, nickel, and manganese, and the metal chelator EDTA, did not affect activation of 
the recombinant GAO in experiments under assay conditions. Experimental results are shown 
in FIG. 3 under assay conditions, with and without various metal (II) ions or EDTA. 

Detection enhancers. In certain assay embodiments, sodium azide or sodium sulfide may 
be added, for example in an amount of from about 0.01 mM to less than 1 mM. These reagents 
may enhance detection of GAO activity in some circumstances. 

Detergents. Addition of detergents to the assay solution also increased the observed 
activity. Pretreatment with SDS was most effective for increasing the galactose oxidase activity. 
Treatment with SDS for longer than 12 hours at 4 °C after treatment with lysozyme was suitable 
for the assay. The galactose oxidase activity did not change with treatment for 12 to 24 
hours at 4 °C. Cultivation, pre-treatment and assay were done as described above. 

Other detergents may also be used, as shown in TABLE 1. In these experiments, 
approximately 0.1 units/ml culture of E. coli BL21(DE3)/pGAO-010 and 0.25 units of partially 
purified galactose oxidase (Sigma) were used. Cells were treated with 0.5 mg/ml lysozyme at 
37 °C for 30 minutes. Enzyme and cells were treated with detergents at 4 °C for 1-12 hours. 
Galactose oxidase activities were assayed using the microplate method described above. 

Cultivation. Activation on an LB-Ap (100 mg/l) plate for 12-24 hours at 30 °C and seed 
cultivation in LB-Ap (100 mg/l), 200-500 µl/well, for 8-10 hours at 30 °C provided uniform 
growth for cultivation. These conditions are suitable if not necessary for the assay, using the 
cells, reactants and reagents in these experiments. 

The addition of IPTG as an inducer was observed to be necessary for the expression of 
galactose oxidase in microplate cultivation in these experiments. Initial addition of IPTG to the 
medium was preferred to the addition of IPTG during cultivation. A cultivation time of 12-16 
hours was preferred, and provided superior results (overall higher activities) for almost all 
recombinant E. coli which had a plasmid for expression of galactose oxidase in these 
experiments. At 37 °C, the growth of cells stopped before 16 hours and the cell extracts had 
almost no activity. Cultivation at about 30 °C was optimal in these 
experiments. 

TABLE 1 

SDS [% w/v]                        Treatment     0        0.00001  0.0001   0.001    0.01     0.1
Relative activity of culture [%]   4 °C, 12 hr   100(1)   131      106      146      241      394
Relative activity of culture [%]   4 °C, 1 hr    100(2)   96       118      134      146      189
Relative activity of GAO(4) [%]    4 °C, 12 hr   100(3)   99       95       103      99       101

Triton X-100 [% w/v]               Treatment     0        0.00001  0.0001   0.001    0.01     1
Relative activity of culture [%]   4 °C, 12 hr   100(1)   133      145      190      220      250
Relative activity of culture [%]   4 °C, 1 hr    100(2)   85       95       123      118      149
Relative activity of GAO [%]       4 °C, 12 hr   100(3)   114      114      109      108      98

Tween 80 [% w/v]                   Treatment     0        0.00001  0.0001   0.001    0.01     1
Relative activity of culture [%]   4 °C, 12 hr   100(1)   135      113      142      139      140
Relative activity of culture [%]   4 °C, 1 hr    100(2)   159      125      144      122      139
Relative activity of GAO [%]       4 °C, 12 hr   100(3)   120      113      106      114      102

DMSO [% w/v]                       Treatment     0        0.00001  0.0001   0.001    0.01     1
Relative activity of culture [%]   4 °C, 12 hr   100(1)   152      140      150      155      152
Relative activity of culture [%]   4 °C, 1 hr    100(2)   169      106      116      116      96
Relative activity of GAO [%]       4 °C, 12 hr   100(3)   104      107      103      97       99

(1) 0.09 units/ml; (2) 0.07 units/ml; (3) 0.25 units/ml; (4) GAO obtained from Sigma 

EXAMPLE 2 
Construction of Galactose Oxidase Plasmids 



Plasmids were constructed to express the galactose oxidase gene (gao) from Fusarium ssp. 
as described below. Several vectors were examined for high expression. Plasmids with different 
promoters and different sequences between the GAO gene and the ribosome binding site were 
constructed, as described. Escherichia coli strains BL21(DE3) and KY-14478 were transformed 
with these plasmids. Permeable cells from test tube cultures were used for the assay. 

A. Construction of Plasmids 

1. Modified pUC18 Vector Plasmids 

Modified pUC18 plasmids were made for use in constructing galactose oxidase 
expression plasmids. As shown in FIG. 7, vector pUC18 was digested with the restriction 
enzyme HindIII, blunted with T4 DNA polymerase and ligated with T4 DNA ligase to create 
vector pUC18-HL lacking the HindIII site. pUC18-HL was digested with EcoRI, blunted with 
T4 DNA polymerase and ligated with T4 DNA ligase to create vector pUC18-EHL lacking the 
EcoRI and HindIII sites. Similarly, pUC18-EHL was digested with PstI, blunted with T4 DNA 
polymerase and ligated with T4 DNA ligase to create vector pUC18-EHPL, lacking the EcoRI, 
HindIII, and PstI sites. 

2. GAO Vector Plasmids 
As shown in FIG. 8, plasmid pGAO-010 expressing GAO was made using plasmid pR3. 
Plasmid pR3 contains the gene for mature galactose oxidase (GAO) fused to the 5' end of the 
lacZ fragment, and was obtained from Dr. Howard K. Kuramitsu (Dept. of Oral Biology, State 
University of New York, Buffalo, NY). The GAO gene was amplified from pR3 by PCR using 
primers P-MY001 and P-MY002 in order to introduce a HindIII restriction site followed by an 
ATG initiation codon immediately upstream from the mature GAO sequence, and an XbaI site 
immediately downstream from the stop codon. (Primer sequences are shown in FIG. 6.) The 
PCR product was digested with HindIII and XbaI and ligated into a similarly digested pUC18 
vector to create pGAO-001. Plasmid pPLA-001 is a modified pUC18 vector containing a double 
lac promoter. The lac promoter from pUC18 was amplified using primers P-MY003 and 
P-MY004. The PCR product was digested with EcoRI and HindIII and ligated into a similarly 
digested pUC18 vector. Following digestion of pGAO-001 with HindIII and XbaI, pPLA-001 
with EcoRI and HindIII and pUC18-HL with EcoRI and XbaI, plasmid pGAO-010 was 
generated by ligation with T4 DNA ligase. 

Another plasmid, pGAO-036, was made by amplifying pGAO-010 using primers 
P-MY036 and P-MY002 (FIG. 9). The PCR product was digested with KpnI and XbaI and ligated 
with a similarly digested pUC18-EHL to create plasmid pGAO-027. Plasmid pGAO-027 was 
digested with KpnI and XbaI and ligated with a similarly digested pUC18-EHPL to create 
plasmid pGAO-036. This plasmid contains a unique PstI site. Plasmid pGAO-036 was used 
for directed evolution experiments described herein. 

Another plasmid, pGAO-011, was made using similar techniques, as shown in FIG. 10. 

B. Plasmids and transformation 

Plasmids for expression of galactose oxidase were constructed as described above. The 
galactose oxidase gene was amplified from pR3 (Fusarium ssp.) by PCR. The lac promoter 
of pUC18 and the T7 promoter of pET-22b(+) (Novagen) were used for expression. In addition to 
expression as the mature sequence of galactose oxidase, expression of the gene as a fusion protein 
with other peptides was examined. The N-terminal sequence of lacZ was selected to express 
the galactose oxidase as a fused protein (127). The PelB leader sequence was also used to produce 
galactose oxidase in the periplasm. Furthermore, a His-tag, which is useful for purification of 
recombinant proteins, was examined as an additional sequence at the C-terminus of galactose 
oxidase. The T7 terminator sequence was used for stabilization of expression. Two different oris 
were chosen for replication of the plasmids. The copy number of a plasmid with the ori from the 
pUC series is higher than that of a plasmid with the ori from the pBR series. 

In more detail, plasmids pUC18, pET-22b(+) (Novagen) and derivatives were used as 
vector plasmids. The galactose oxidase gene from Fusarium ssp. was amplified from pR3 according 
to known techniques (110, 127). Genes were manipulated according to conventional methods 
using kits from Qiagen (Valencia, CA). The QIAprep Spin Miniprep Kit, QIAquick Gel 
Extraction Kit and QIAEX II Gel Extraction Kit were used respectively for purification of 
plasmids from cells, purification of DNA fragments and extraction of DNA fragments from 
agarose gel. E. coli DH5αMCR was transformed with plasmids by treatment with CaCl2 (19). 
Electroporation was used for transformation of E. coli BL21(DE3) with plasmids (147, 148). 

pUC18 and pET-22b(+) (Novagen) were used as vector plasmids. The gene of galactose 
oxidase from pR3 (127) was used. The lac promoter from pUC18, the tac promoter from pKK223-3 
(Amersham Pharmacia Biotech) and the T7 promoter from pET-22b(+) were selected for expression 
of the gene. The N-terminal sequence of lacZ from pUC18, and the PelB leader, His-tag and T7 
terminator sequences from pET-22b(+), were used for production of galactose oxidase. The 
gene and parts for expression were prepared by PCR. PCR was done in 100 µl of reaction 
solution containing PCR buffer (10 mM Tris-HCl pH 8.5, 50 mM KCl, 2.5 mM MgCl2, 0.01 
% gelatin), 1 ng of DNA as template, 50 pmol of each primer, 2.5 units of Taq DNA 
polymerase (Perkin Elmer) and 50 nmol of each dNTP. DNA fragments were amplified in 30 
cycles of 30 seconds at 94 °C, 30 seconds at 50 °C and 60 seconds at 72 °C. PCR products 
were purified with the QIAquick PCR Purification Kit (Qiagen). Cutting and ligation of DNA by 
enzymes were performed according to "Molecular Cloning" (19). E. coli cells were transformed with 
plasmids by electroporation (Bio-Rad, Gene Pulser). The QIAprep Spin Miniprep Kit (Qiagen) was 
used for purification of plasmid from E. coli recombinant cells. 
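
As a quick arithmetic check on the PCR recipe, the stated amounts can be converted to final concentrations and a total cycling time. This sketch assumes a reaction volume of 100 µl (microlitre scale, as is usual for PCR) and ignores temperature ramp times; the function names are illustrative only.

```python
def concentration_um(amount_pmol, volume_ul):
    """Concentration in micromolar; 1 pmol per µl equals 1 µM."""
    return amount_pmol / volume_ul


def cycling_time_min(cycles, step_seconds):
    """Total hold time of all cycles in minutes (ramp times are ignored)."""
    return cycles * sum(step_seconds) / 60


REACTION_VOLUME_UL = 100  # assumed reaction volume

primer_um = concentration_um(50, REACTION_VOLUME_UL)       # 50 pmol of each primer
dntp_um = concentration_um(50 * 1000, REACTION_VOLUME_UL)  # 50 nmol of each dNTP
hold_min = cycling_time_min(30, [30, 30, 60])              # 94 / 50 / 72 °C steps

print(primer_um, dntp_um, hold_min)
```

Under these assumptions each primer is at 0.5 µM, each dNTP at 500 µM (0.5 mM), and the 30 cycles take 60 minutes of hold time.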

Using these strategies, plasmids were designed to express the galactose oxidase gene. 
The plasmids were transformed into E. coli DH5αMCR, BL21(DE3) and KY-14478. 
Representative plasmids are shown diagrammatically in FIG. 11, according to the general 
scheme shown in FIG. 12. 

Expression of the galactose oxidase gene in all constructed plasmids was controlled by 
the lac operator. Therefore, induction by isopropyl β-D-thiogalactopyranoside (IPTG) was 
necessary for production of the enzyme (FIG. 11). The expression of galactose oxidase was 
highest when IPTG (1 mM) was added after cultivation for 7 hours and cells were incubated for 
6 more hours. Cultivation at 30 °C gave the greatest activity of galactose oxidase per cultivation. 
Expression of the enzyme was remarkably decreased at 37 °C. Temperatures lower than 27 °C 
were not suitable in the experiments because the cells grew very slowly. 

Incubation on LB plate at 30 °C for 18 hours and pre-cultivation in LB at 30°C for 12 
hours stabilized the main cultivation. The optimal culture conditions were selected as shown 
above. 

C. Galactose oxidase activity 

Galactose oxidase activities of the recombinant E. coli were measured (FIG. 11). Some 
recombinant strains showed much higher activities than the recombinant carrying plasmid pR3. 
These recombinants hold plasmids which were constructed with the lac promoter and the ori from 
the pUC series. Some recombinant E. coli with plasmids expressing the galactose oxidase gene 
from the T7 promoter, pGAO-018 and pGAO-023, did not grow well, and their galactose oxidase 
activities were not detected. Although some recombinants holding plasmids with the T7 promoter, 
pGAO-008 and pGAO-009, grew normally, they showed low galactose oxidase activity. From these 
results, the lac promoter was suitable for expression of the galactose oxidase gene. Furthermore, 
a double lac promoter seemed to be stronger than a single lac promoter in some but not all cases. 

For example, plasmid pGAO-025 was designed to have a double lac promoter and a 
lacZ-gao fused gene (FIG. 13). However, the galactose oxidase activity of a recombinant with 
pGAO-025 was almost the same as that of a recombinant with pGAO-011, which had a single lac 
promoter, in KY-14478 cells, but was higher than that of pGAO-011 in BL21(DE3) cells. A triple 
lac promoter was also examined for expression of the galactose oxidase gene. The effect of the 
triple promoter was about the same as that of the double promoter, e.g. in pGAO-028 and 
pGAO-010 (FIGS. 15 and 17). 

Galactose oxidase fused with the N-terminal sequence of lacZ or the PelB leader 
was produced, as well as non-fused proteins. The activity of galactose oxidase fused with the PelB 
leader was not detected without a pre-treatment of cells. Detection of activity of the enzyme 
required the same pre-treatment of recombinant cells as the others. In these experiments GAO was 
not secreted into the medium, although a secretion signal sequence was present. 

Plasmids pGAO-003 and pGAO-005 were designed to produce galactose oxidase in 
fused form with His-tag at its C-terminal. No galactose oxidase activity was detected from 
recombinant strains with these plasmids. 

Terminator sequence sometimes stabilizes gene expression. In these experiments, 
introduction of T7 terminator sequence apparently did not increase GAO expression. Compare 
pGAO-020 with pGAO-010 or pGAO-022 with pGAO-017. 

E. coli DH5αMCR expressed the galactose oxidase gene with these plasmids. However, 
their activities were lower than those of recombinant strains of E. coli BL21(DE3) and E. coli 
KY-14478 (data not shown). E. coli BL21(DE3) and E. coli KY-14478 with plasmid pGAO-010 
or pGAO-027 successfully expressed galactose oxidase with high activity. These two plasmids 
have the same sequence except for one restriction endonuclease site in the vector sequence. 
Their structure is suitable for expressing the galactose oxidase as the mature fungal sequence. 
Consequently, E. coli BL21(DE3) and E. coli KY-14478 harboring plasmid pGAO-010, 
pGAO-027 or their derivatives were used for continued experiments. 

D. Codon Alteration 

Codon alteration of the N-terminal sequence of a gene, without changing the peptide 
sequence, may cause higher expression of the gene in some cases. Codons of six N-terminal 
amino acid residues of galactose oxidase were exchanged randomly by PCR with a mixed primer, 
with the following alterations. 

                                                          SEQ ID NO: 
                          A   S   A   P   I   G   S   A           26 
Wild-type sequence   ATG GCC TCA GCA CCT ATC GGA AGC GCC ...      27 
Random alteration    --- --N --N --N --N --A --N                  28 
                                           T 
                                           C 
The galactose oxidase gene of pGAO-010 was replaced with PCR products comprising 
the galactose oxidase gene with random codon alteration. The plasmids of this library were 
named pGAO-010M. This random codon alteration of the N-terminal sequence did not cause 
higher expression (FIG. 14), and in many cases GAO activity was reduced. No significant 
difference was observed when E. coli KY-14478 was used as a host strain, compared with E. 
coli BL21(DE3). 
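
The size of such a silent-codon library can be sketched as below. The degenerate pattern is an assumed reading of the randomization scheme (third base fully randomized for the Ala, Ser, Ala, Pro and Gly codons; A, T or C at the third base of the Ile codon), written with the IUPAC codes N and H. The code enumerates all combinations and checks that every one still encodes ASAPIG.

```python
from itertools import product

# IUPAC nucleotide codes used in this sketch.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "H": "ACT"}

# Four-fold degenerate codon families (standard genetic code) needed here.
FAMILY_TO_AA = {"GC": "A", "TC": "S", "CC": "P", "GG": "G"}


def translate(codon):
    """Translate one codon; covers only the codons that occur in this library."""
    if codon[:2] in FAMILY_TO_AA:
        return FAMILY_TO_AA[codon[:2]]
    if codon in ("ATT", "ATC", "ATA"):
        return "I"
    raise ValueError(f"codon not covered by this sketch: {codon}")


def expand(pattern):
    """All concrete codons matching a degenerate three-letter pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[c] for c in pattern))]


# Assumed degenerate pattern for residues A, S, A, P, I, G.
PATTERNS = ["GCN", "TCN", "GCN", "CCN", "ATH", "GGN"]
choices = [expand(p) for p in PATTERNS]

library_size = 1
for codons in choices:
    library_size *= len(codons)

peptides = {"".join(translate(c) for c in combo) for combo in product(*choices)}
print(library_size, peptides)
```

Under this reading the library holds 4 x 4 x 4 x 4 x 3 x 4 = 3072 DNA variants, all encoding the same peptide.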



E. Optimization of upstream sequence of gao 

The region between the Shine-Dalgarno ("SD") sequence AGGA and the initiation 
codon, ATG, is important for efficient RNA translation and has a significant influence on 
expression of a gene. One to three bases were inserted between the SD of the lac promoter and the 
ATG of the galactose oxidase gene in pGAO-027 to investigate the impact of altering the 
distance between SD and ATG. A change in the length of the region between SD and ATG 
caused a decrease in galactose oxidase activity when E. coli BL21(DE3) was used as a host 
strain (TABLE 2; SEQ ID NOS: 29-36). The original sequence of pGAO-027 or the one-base 
extended sequence of pGAO-029 was preferred for expression of the gene. When E. coli 
KY-14478 was used as a host strain, a one- or two-base extension of the sequence between SD and 
ATG was preferred for expression of the gene. 
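
The spacing being varied can be made concrete by counting the bases between the SD sequence and the start codon. The sketch below uses three of the sequences listed in TABLE 2; the helper name is hypothetical.

```python
SD = "AGGA"    # Shine-Dalgarno sequence named in the text
START = "ATG"  # initiation codon

# Sequences as listed in TABLE 2 for three of the lac-promoter plasmids.
SEQUENCES = {
    "pGAO-027": "AGGAAAAGCTTATG",
    "pGAO-029": "AGGAAAAAGCTTATG",
    "pGAO-030": "AGGAAACAAGCTTATG",
}


def spacer_length(seq):
    """Number of bases between the SD sequence and the ATG start codon."""
    if not (seq.startswith(SD) and seq.endswith(START)):
        raise ValueError("sequence must start with SD and end with ATG")
    return len(seq) - len(SD) - len(START)


for name, seq in SEQUENCES.items():
    print(name, spacer_length(seq))
```

For these three plasmids the spacer is 7, 8 and 9 bases long, i.e. the parent sequence and its one- and two-base extensions.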



TABLE 2 

                                                       GAO Activity (units/ml) 
Plasmid*   Sequence between SD and ATG   Promoter      BL21(DE3)    KY-14478 
027        ...AGGAAAAGCTTATG...          lac           19.0         12.5 
029        ...AGGAAAAAGCTTATG...         lac           19.1         15.7 
030        ...AGGAAACAAGCTTATG...        lac           16.3         14.3 
031        ...AGGAACAAAGCTTATG...        lac           15.9         13.1 
032        ...AGGAAAAGCTTATG...          tac           30.6         52.4 
033        ...AGGAAAAAGCTTATG...         tac           25.7         56.2 
034        ...AGGAAACAAGCTTATG...        tac           34.6         49.8 
035        ...AGGAACAAAGCTTATG...        tac           22.          38.7 

*Plasmids are designated pGAO-XXX, where XXX is 027 through 035. 


The tac promoter often, if not usually, expresses genes at higher levels than the lac promoter. 
The tac promoter was prepared from pKK223-3 (Amersham Pharmacia Biotech) by PCR. The lac 
promoters of plasmids pGAO-027, pGAO-029, pGAO-030 and pGAO-031 were replaced with the 
tac promoter. Recombinant strains with plasmids using the tac promoter for expression showed 
approximately twice as much activity as the recombinant strains using the lac promoter (TABLE 
2). The optimal distance between SD and ATG under the tac promoter was almost the same as 
that under the lac promoter in both E. coli strains. 

Recombinant strains E. coli BL21(DE3)/pGAO-034 and E. coli KY-14478/pGAO-033 
were considered to be good for expression of galactose oxidase. Optimal culture conditions for 
these strains were as described above. 



F. Properties of recombinant galactose oxidase 

Galactose oxidase from Dactylium dendroides (Fusarium ssp.) and the enzyme from 
recombinant E. coli BL21(DE3)/pGAO-010 differ only in glycosylation; their amino acid 
sequences are identical. 

Substrate specificities of recombinant galactose oxidase from E. coli and the enzyme 
from fungi were compared. Cell-free extract of E. coli BL21(DE3)/pGAO-010 was used as a 
crude recombinant enzyme from E. coli. Partially purified galactose oxidase from Dactylium 
dendroides (Sigma, partially purified) was used as the fungal enzyme. Substrate specificities of 
these two enzymes were almost the same (FIG. 15). 

EXAMPLE 3 
Optimization of error-prone PCR conditions 

A. General PCR Conditions 

Mutation of the galactose oxidase gene (gao) was induced by error-prone PCR 
according to known techniques (66, 129-133, 136-139). Wild type gao on pGAO-027 was 
replaced by the PCR products, which were mutant galactose oxidase genes. The resultant 
plasmids were named pGAO-027M. E. coli BL21(DE3) was transformed with these 
plasmids. Almost all transformants carrying error-prone PCR products instead of wild type gao 
lost their galactose oxidase activities (FIG. 7). Mutations were induced on the whole galactose 
oxidase gene by error-prone PCR, using conditions "A" of TABLE 3. 228 clones were selected 
randomly from each set of conditions with different manganese concentrations. These clones 
were cultivated and assayed with micro-plates. More than 65 % of transformants lost their 
galactose oxidase activity, even though manganese ions were not added to the PCR solution. 

Various reaction conditions for error-prone PCR were compared, and in particular 
milder conditions were examined for mutation of the galactose oxidase gene. Conditions "A" 
and "C" are the previous conditions of error-prone PCR (above) and normal PCR conditions, 
respectively. The use of a buffer solution for error-prone PCR (Buffer EP) increased the error 
rate. Non-uniform composition of dNTPs for error-prone PCR (dNTPs EP) induced mutations 
at a higher rate than uniform composition of dNTPs for normal PCR (dNTPs normal). Taq DNA 
polymerase from Promega Corporation showed a higher error rate than the enzyme from Perkin 
Elmer. Since the rate of inactivation was 31 % at most in condition "C" (FIG. 5), induction of 
mutation was not optimal, and may have been insufficient. In FIG. 5, mutations were induced 
in the whole galactose oxidase gene by error-prone PCR using conditions "C" of TABLE 3. 
Activities of 288 clones from each set of conditions with different manganese concentrations 
were estimated using micro-plate screening. 

From the alternatives examined in these experiments, error-prone PCR condition "F" had 
a suitable frequency of error and was selected to induce mutation on the galactose oxidase gene 
in further experiments. The composition of buffer solution, the content of dNTPs and 
thermophilic DNA polymerase each affected the rate of mutation. For example, the difference 
between the buffer solution for normal PCR and the buffer solution for error-prone PCR was that 
the EP buffer contained gelatin. Since gelatin is not expected to influence the error rate of the 
PCR reaction, the observed rate difference may be due to a small difference in the final pH of 
reaction mixtures with these buffer solutions. More error was induced by non-uniform content 
of dNTPs for error-prone PCR than uniform content of dNTPs for normal PCR. Selection of 
the thermophilic DNA polymerase can be significant when optimizing an error-prone PCR 
experiment, as the particular polymerase may influence the mutation rate. 

The PCR conditions selected for mutation of the whole galactose oxidase gene in these 
experiments were milder than previously disclosed conditions (66, 129-133, 136-139). When the 
PCR conditions described previously were used for error-prone PCR of the galactose oxidase 
gene, the mutation rate was too high, resulting in too many inactive or low-activity clones. This 
result may be related to the fact that the galactose oxidase gene is as much as twice as large as 
genes previously used for error-prone PCR in the literature. Without being bound by any theory, 
inactivating mutations may be induced more frequently as the target gene becomes larger. 
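
By way of illustration only, the dependence on gene length can be sketched with a simple independence model. The function name, the per-base error rate and the gene lengths below are hypothetical values chosen for the example; they are not measurements from these experiments, and the model assumes that every base is mutated independently with the same probability.

```python
def fraction_mutated(gene_length_bp, per_base_error_rate):
    """Probability that one amplified copy carries at least one mutation.

    Illustrative assumption: each base mutates independently with the
    same probability, so a copy is mutation-free with probability
    (1 - rate) ** length.
    """
    return 1.0 - (1.0 - per_base_error_rate) ** gene_length_bp


# Hypothetical comparison: a 1 kb gene versus a gene twice as large,
# amplified at the same (assumed) per-base error rate.
for length in (1000, 2000):
    print(length, "bp:", round(fraction_mutated(length, 0.001), 3))
```

Under these assumptions, doubling the target length at a fixed per-base error rate raises the fraction of mutated copies, which is consistent with the need for milder conditions for a larger gene.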

In TABLE 3, 96 of 288 clones were selected randomly from each library. Their 
galactose oxidase activities were estimated by the micro-plate screening method. Rates of clones 
which lost their galactose oxidase activities are shown in the table. 
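
For illustration, the percentages reported in TABLE 3 can be recomputed from the clone counts given in parentheses in the table. The sketch below uses the counts listed for conditions "A" and "C" (288 clones per library); the function and variable names are introduced here for the example only.

```python
def percent_inactive(inactive_clones, screened_clones):
    """Percentage of screened clones without activity, rounded to a whole number."""
    return round(100.0 * inactive_clones / screened_clones)


# (inactive, screened) counts from TABLE 3, keyed by MnCl2 concentration in mM
condition_a = {0: (173, 288), 0.1: (199, 288), 0.15: (223, 288),
               0.2: (220, 288), 0.4: (258, 288), 0.5: (270, 288)}
condition_c = {0: (14, 288), 0.1: (27, 288), 0.15: (29, 288),
               0.2: (31, 288), 0.4: (81, 288), 0.5: (90, 288)}

for name, counts in (("A", condition_a), ("C", condition_c)):
    rates = {mn: percent_inactive(*pair) for mn, pair in counts.items()}
    print(name, rates)
```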

FIG. 4 and FIG. 5 show the effect of varying amounts of MnCl2 in these experiments. 

In the mutagenesis methods used herein, the error rate is from 1-6 mutations per 
polynucleotide, preferably 4-6, and most preferably 6. In certain embodiments with more than 
one round of directed evolution, the error rate may be different from one round to another. For 
example, the error rate may be about 1-2 mutations per polynucleotide in one round (e.g. a first 
round), and may be about 4-6 mutations per polynucleotide in another round (e.g. a second 
round). 
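
As a non-limiting illustration, an error rate expressed in mutations per polynucleotide can be converted to mutations per kilobase. The sketch assumes that the polynucleotide is the whole gao gene of 1917 bases (the length given in Example 4); the function name is hypothetical.

```python
GAO_GENE_LENGTH_BP = 1917  # whole gao gene, bases 1-1917 (see Example 4)


def mutations_per_kb(mutations_per_gene, gene_length_bp=GAO_GENE_LENGTH_BP):
    """Convert mutations per gene copy into mutations per 1000 bases."""
    return mutations_per_gene * 1000.0 / gene_length_bp


for per_gene in (1, 2, 4, 6):
    print(f"{per_gene} per gene = {mutations_per_kb(per_gene):.2f} per kb")
```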



TABLE 3 

      PCR conditions                                  Inactivated clones [%] (inactive/screened) at MnCl2 concentration 
Cond. Buffer  MgCl2   dNTPs   Taq DNA polymerase      0 mM          0.1 mM        0.15 mM       0.2 mM        0.4 mM        0.5 mM 
A     EP      7 mM    EP      Promega, 50 u/ml        60 (173/288)  69 (199/288)  77 (223/288)  76 (220/288)  90 (258/288)  94 (270/288) 
B     EP      7 mM    normal  Promega, 50 u/ml        55 (53/96)    61 (59/96) 
C     normal  2.5 mM  normal  Perkin Elmer, 25 u/ml   3 (3/96)      10 (10/96) 
                                                      5 (14/288)    9 (27/288)    10 (29/288)   11 (31/288)   28 (81/288)   31 (90/288) 
D     EP      7 mM    EP      Perkin Elmer, 25 u/ml   45 (43/96)    61 (59/96) 
E     EP      7 mM    EP      Perkin Elmer, 50 u/ml   39 (37/96)    52 (50/96) 
F     normal  7 mM    EP      Perkin Elmer, 25 u/ml   23 (22/96)    41 (39/96) 
G     normal  7 mM    EP      Promega, 50 u/ml        41 (39/96)    52 (50/96) 
H     EP      7 mM    normal  Promega, 50 u/ml        51 (49/96)    61 (59/96) 

Buffer EP (x10): 500 mM KCl, 100 mM Tris-HCl (pH 8.3), 0.1% (w/v) gelatin 
Buffer (normal) (x10): 500 mM KCl, 100 mM Tris-HCl (pH 8.3) 
dNTPs EP: 0.2 mM dGTP, 0.2 mM dATP, 1 mM dCTP, 1 mM dTTP 
dNTPs (normal): 0.5 mM dGTP, 0.5 mM dATP, 0.5 mM dCTP, 0.5 mM dTTP 

EXAMPLE 4 
Production of Galactose Oxidase Mutants 

The directed evolution of galactose oxidase (GAO) is described. GAO variants with 
increased activity toward allyl alcohol and D-galactose and increased thermostability relative to 
wild-type have been identified. 

A. Construction of GAO Mutant Libraries 

Plasmid pGAO-036, expressing wild-type GAO, was used as the parent for the directed 
evolution of GAO (FIG. 9). 

Two strategies have been followed for the directed evolution of the enzyme: (A) 
mutagenesis of the whole GAO gene (bases 1-1917) and (B) mutagenesis of part of the GAO 
gene (bases 518-1917). In Approach A, two rounds of error-prone PCR (45) have been 
performed (generations A1 and A2), followed by one round of StEP recombination (generation 
A3) (139) of four improved variants identified in library A2. In Approach B, four rounds of 
error-prone PCR (45) have been performed (generations B1 through B4). E. coli strain 
BL21(DE3) (Novagen) was used for the expression of GAO. 

1. Approach A 

Error-prone PCR was carried out in a 100 µl reaction mixture containing about 0.3 µg 
plasmid DNA as template, 30 pmol of each primer, 0.2 mM dGTP, 0.2 mM dATP, 1 mM dCTP, 
1 mM dTTP, 7 mM MgCl2, 0.1 mM MnCl2, and 2.5 U Taq polymerase (Perkin Elmer) in 10 mM 
Tris-HCl, 50 mM KCl buffer, pH 8.5. PCR conditions were as follows: 30 cycles of 94 °C for 
30 seconds, 50 °C for 30 seconds and 72 °C for 60 seconds. The percentage of inactive clones 
was between 30 and 50%. 
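
For illustration, the amounts stated per 100 µl reaction can be converted to concentrations so that they may be compared with the units used in TABLE 3. The helper names below are introduced for this sketch only.

```python
def primer_concentration_um(primer_pmol, reaction_volume_ul):
    """Primer concentration in µM; pmol per µl is numerically equal to µmol per litre."""
    return primer_pmol / reaction_volume_ul


def polymerase_units_per_ml(units, reaction_volume_ul):
    """Polymerase concentration in units per ml of reaction mixture."""
    return units * 1000.0 / reaction_volume_ul


# Error-prone PCR of Approach A: 30 pmol of each primer and 2.5 U Taq in 100 µl
print(primer_concentration_um(30, 100), "µM of each primer")
print(polymerase_units_per_ml(2.5, 100), "U/ml Taq polymerase")
```

On this basis, 2.5 U of polymerase in 100 µl corresponds to 25 U/ml, and 5 U in 100 µl corresponds to 50 U/ml.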

StEP recombination of the four improved variants identified in generation A2 was 
performed in a 100 µl reaction mixture containing about 0.3 mg (total) plasmid DNA as template 
(prepared by mixing equal amounts of all four plasmids), 10 pmol of each primer, 0.5 mM of 
each dNTP, 2.5 mM MgCl2, and 5 U Taq polymerase (Perkin Elmer) in 10 mM Tris-HCl, 50 
mM KCl buffer, pH 8.5. PCR conditions were 95 °C for 3 minutes and 100 cycles of 94 °C 
for 30 seconds and 58 °C for 10 seconds. The primers used for error-prone PCR and StEP were: 
5'-AATTCGAAGCTTATGGCCTCAGCACCTATCGGAAGC-3' (forward) [SEQ. ID. NO. 1] 
and 5'-CTTCCTTCTAGATTACTGAGTAACGCGAATCGT-3' (reverse) [SEQ. ID. NO. 2]. 
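
By way of example, the programmed hold time of the cycling protocols of Approach A can be totalled as sketched below. The function name is hypothetical, and the result counts only the stated hold times; ramp times between temperatures depend on the thermal cycler and are not included.

```python
def program_hold_seconds(initial_hold_s, cycles, step_hold_s):
    """Total programmed hold time: one initial hold plus `cycles` repeats of the steps."""
    return initial_hold_s + cycles * sum(step_hold_s)


# Error-prone PCR: 30 cycles of 30 s, 30 s and 60 s (no initial hold stated)
error_prone_s = program_hold_seconds(0, 30, [30, 30, 60])

# StEP: 3 minutes at 95 °C, then 100 cycles of 30 s and 10 s
step_s = program_hold_seconds(180, 100, [30, 10])

print("error-prone PCR:", error_prone_s / 60, "min")
print("StEP:", round(step_s / 60, 1), "min")
```

The StEP program has many more cycles, each with a single short combined annealing and extension step of 10 seconds, in contrast to the separate 30 second annealing and 60 second extension steps of the error-prone PCR program.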



2. Approach B 

Error-prone PCR was carried out in a 100 µl reaction mixture containing 10 ng plasmid 
DNA as template, 50 pmol of each primer, 0.2 mM of each dNTP, 7 mM (generations B1 and 
B2) or 4 mM MgCl2 (generations B3 and B4), and 5 U Taq polymerase (Boehringer Mannheim) 
in 10 mM Tris-HCl, 50 mM KCl buffer, pH 8.3. PCR conditions were as follows: 94 °C for 2 
minutes and 25 cycles of 94 °C for 30 seconds, 58 °C for 30 seconds and 72 °C for 60 seconds. 
The primers used were: 

5'-TTGTTCCTGCGGCTGCAGCAATTGAACCG-3' (forward) [SEQ. ID. NO. 8] and 
5'-TGCCGGTCGACTCTAGATTACTGAGTAACG-3' (reverse) [SEQ. ID. NO. 9]. 

The percentage of inactive clones was between 30 and 40%. 
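
For illustration, the four primers given above (SEQ. ID. NOS. 1, 2, 8 and 9) can be scanned for restriction enzyme recognition sequences. The recognition sequences in the sketch are the standard ones for the named enzymes; the text does not state which enzymes were used for cloning, so the choice of enzymes to scan for is an assumption made for this example.

```python
# Standard recognition sequences; the selection of enzymes is an assumption.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "XbaI": "TCTAGA",
    "PstI": "CTGCAG",
    "SalI": "GTCGAC",
}

PRIMERS = {
    "SEQ ID NO 1": "AATTCGAAGCTTATGGCCTCAGCACCTATCGGAAGC",
    "SEQ ID NO 2": "CTTCCTTCTAGATTACTGAGTAACGCGAATCGT",
    "SEQ ID NO 8": "TTGTTCCTGCGGCTGCAGCAATTGAACCG",
    "SEQ ID NO 9": "TGCCGGTCGACTCTAGATTACTGAGTAACG",
}


def find_sites(primer):
    """Map enzyme name to the 0-based offset of its first site in the primer."""
    return {name: primer.find(site)
            for name, site in RECOGNITION_SITES.items()
            if site in primer}


for label, sequence in PRIMERS.items():
    print(label, find_sites(sequence))
```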

B. Screening of GAO Libraries 

GAO activity was screened in 96-well plates, using the methods of Approaches A and 
B, respectively, as described in Example 1(D). 

C. Laboratory Evolution of GAO 

The thermal stability curves of selected GAO variants are shown in FIG. 16. Variants 
were grown in test tubes (3 ml cultures). Following centrifugation and resuspension of the cell 
pellets in NaPi buffer, pH 7.0 containing CuSO4, the cells were lysed. Aliquots of the cell 
extracts were heated at each temperature for 10 min and then cooled down on ice for 10 min 
before the residual activity toward D-galactose was determined at room temperature. 

Results of the laboratory evolution of GAO to increase activity and thermostability are 
listed in TABLE 4. T50 is an operational measure of stability and is defined as the temperature 
at which the enzyme loses 50% of its activity following incubation for a set time. 
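
As a non-limiting sketch, T50 can be estimated from a thermal stability curve by linear interpolation between the two temperatures that bracket 50% residual activity. The data points below are hypothetical and serve only to show the calculation; they are not values from FIG. 16, and the function name is introduced for this example.

```python
def estimate_t50(temperatures_c, residual_fractions):
    """Interpolate the temperature at which residual activity falls to 0.5.

    Expects temperatures in ascending order and residual activity expressed
    as a fraction of the unheated control. Returns None if 0.5 is not crossed.
    """
    points = list(zip(temperatures_c, residual_fractions))
    for (t_low, a_low), (t_high, a_high) in zip(points, points[1:]):
        if a_low >= 0.5 >= a_high and a_low != a_high:
            return t_low + (a_low - 0.5) * (t_high - t_low) / (a_low - a_high)
    return None


# Hypothetical curve: residual activity after 10 min at each temperature
temperatures = [30, 40, 50, 60, 70]
residual = [1.00, 0.95, 0.70, 0.30, 0.05]
print(estimate_t50(temperatures, residual))
```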



1. Approach A 

Wild-type GAO (pGAO-036) was used as the parent for generation A1 of GAO variants. After 
screening about 1500 clones, three mutants, 9.16.8D2, 9.16.6C11 and 9.16.16D12, were 
identified as more active toward allyl alcohol and/or galactose. Clone 9.16.16D12, which was 
also more thermostable than wild-type GAO, was used to parent generation A2 of GAO variants. 
Four improved mutants were identified in this library following screening of about 1500 clones: 
11.03.6D3, 11.03.10C3, 11.03.10D6 and 11.03.13E12. These clones were more active than the 
parent toward allyl alcohol and galactose. Clone 11.03.10C3 was substantially more 
thermostable than the parent, as well. These four improved variants were recombined by StEP 
in generation A3. Screening of about 2000 clones led to the identification of variant 1.06.20E7, 
which shows about a 200-fold increased activity toward allyl alcohol and D-galactose and 
exhibits about a 12 °C higher T50 with respect to wild-type GAO. 

2. Approach B 

Wild-type GAO (pGAO-036) was used as the parent for generation B1 of GAO variants. 
After screening about 900 clones, variant 1.D4 was identified as more active toward galactose 
and used to parent generation B2. Mutant 2.G4 was identified as more active toward galactose 
in this library following screening of about 1500 clones. Library B3 of GAO variants was 
generated using 2.G4 as the parent, and clone 3.H7 was identified as an improved variant after 
screening about 1500 clones. Finally, library B4 was created using 3.H7 as the parent and about 
1500 clones were screened. Variant 4.F12 was identified as about 15-fold more active toward 
galactose relative to wild-type GAO. 

D. Active and Thermostable Mutations 

Most beneficial mutations occur in domains II and III of the GAO gene (residues 156- 
532 and 533-639, respectively) (87). Mutation V494A, which was identified several times in the 
screen, is located at the bottom of the active site adjacent to the copper ligand Y495. Its 
presence increases the binding affinity for galactose approximately 3-fold. N535D is found in 
a solvent-exposed loop in domain III. The amino acid substitution G195E is largely responsible 
for the observed increase in thermostability of variant 1.06.20E7 relative to wild-type. See FIG. 
16 and TABLE 4. 

It should also be noted that a large number of mutations (five in these experiments) 
resulted from the substitution of a neutral residue by a negatively charged residue. This tends 
to decrease the isoelectric point of GAO in the mutants (the pI of wild-type GAO is 12). A 
decrease in pI is advantageous, in that it may lead to fewer interactions between the mutant GAO 
and other macromolecules, and lower adhesion to glass. It may also permit increased use of 
crude galactose oxidase preparations in organic synthesis (107). 



TABLE 4 

Mutations identified in GAO variants and their effects on GAO properties. 

GEN  GAO name     Nucleotide base  Amino acid    Relative activity   Relative activity   T50 
                  substitution     substitution  for allyl alcohol*  for D-galactose     (°C) 

0    pGAO-036     N/A (WT)         N/A (WT)      1.0                 1.0                 42 

A1   9.16.8D2                      [illegible]   2.6                 4.6 

A1   9.16.6C11    T1481C           V494A         2.8                 1.3 
                  T1543A           C515S 

A1   9.16.16D12   T1481C           V494A         3.0                 4.9                 44 
                  T408C            P136 

A2   11.03.6D3    T1481C           V494A                             11 
                  T408C            P136 

A2   11.03.10C3   T1481C           V494A         3.8                 9.6                 54 
                  T408C            P136 
                  G584A            G195E 
                  A9C              A3 

A2   11.03.10D6   T1481C           V494A         5.4                 11 
                  T408C            P136 
                  A936G            L312 
                  A1603G           N535D 
                  T654C            T218 

A2   11.03.13E12  T1481C           V494A         5.1                 9.1 
                  T408C            P136 
                  A208G            M70V 

A3   1.06.20E7    T1481C           V494A         20                  55                  54 
                  T28C             S10P 
                  T408C            P136 
                  A208G            M70V 
                  G584A            G195E 
                  A1603G           N535D 

B1   1.D4         A1237G           N413D                             2.4 

B2   2.G4         A1237G           N413D                             4.0 
                  T1650A           S550 

B3   3.H7         A1237G           N413D                             8.6 
                  T1650A           S550 
                  T1481C           V494A 

B4   4.F12        A1237G           N413D                             15.2 
                  T1650A           S550 
                  T1481C           V494A 
                  T1830A           S610 

*Allyl alcohol is oxidized by wild-type GAO at ca. 3% the rate of galactose oxidation. 

Mutations identified at residues A3, L312, T218, P136, S550 and S610 are synonymous 
and, without being bound by theory, the observed increase in activity is probably due to higher 
expression of GAO in E. coli. Given the low expression level of recombinant wild-type GAO 
(less than 3% of total intracellular protein as determined by SDS-PAGE), this is a much needed 
improvement. 
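
For illustration, the correspondence between the nucleotide substitutions and the residue numbers listed in TABLE 4 can be checked arithmetically. The sketch assumes that nucleotide 1 is the first base of the codon for residue 1 of the numbering used in the table; the function names are introduced for this example only.

```python
import re


def parse_substitution(label):
    """Split a label such as 'T1481C' into (original base, position, new base)."""
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", label)
    if match is None:
        raise ValueError(f"not a nucleotide substitution: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def codon_number(nucleotide_position):
    """Residue number of the codon containing a 1-based nucleotide position."""
    return (nucleotide_position - 1) // 3 + 1


def position_in_codon(nucleotide_position):
    """Position (1, 2 or 3) of the nucleotide within its codon."""
    return (nucleotide_position - 1) % 3 + 1


# Nucleotide substitutions from TABLE 4 with the residue number listed beside them
TABLE_4_PAIRS = {
    "T1481C": 494, "T1543A": 515, "T408C": 136, "G584A": 195, "A9C": 3,
    "A936G": 312, "A1603G": 535, "T654C": 218, "A208G": 70, "T28C": 10,
    "A1237G": 413, "T1650A": 550, "T1830A": 610,
}

for label, residue in TABLE_4_PAIRS.items():
    _, position, _ = parse_substitution(label)
    print(label, "-> codon", codon_number(position),
          "position", position_in_codon(position), "listed residue", residue)
```

Under this numbering, each of the substitutions at residues A3, L312, T218, P136, S550 and S610 falls on the third position of its codon.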

The variants identified also exhibit increased activity toward a variety of GAO substrates. 
Mutant 1.06.20E7 is about 200-fold more active toward 3-pyridylcarbinol and mutant 4.F12 is 
about 15-fold more active toward glycerol, xylitol, beta-D-lactose, and IPTG. 

The sequences of representative mutants of the invention identified in TABLE 4 are 
shown in FIGS. 17-28. 



As shown in the above Examples, the galactose oxidase gene can be expressed in E. coli 
in relatively high yield, with an increased activity toward at least one substrate. In certain 
embodiments the activity is greatly increased toward several substrates. In certain embodiments 
the mutants exhibit thermostability. 

The inducible promoters Plac or Ptac were effective for expression of the galactose 
oxidase gene and are preferred. Much higher expression may be possible when other strong 
promoters are used. However, some strong promoters may be counterproductive. For example, 
E. coli did not grow well when the T7 promoter, which is stronger than the lac promoter, was 
used for expression of the galactose oxidase gene. Double promoters, Plac-Plac or Plac-Ptac, 
were selected to express the galactose oxidase gene. Double promoters expressed the gene more 
strongly than a single promoter, as seen by comparing pGAO-025 and pGAO-011. Triple 
promoters expressed the gene as well as double promoters. The upper promoter of the double 
promoters seemed to be less effective than the lower promoter in the Examples. Therefore, 
double promoters of Plac-Plac or Plac-Ptac are preferred. Induction of the gene by IPTG was 
necessary when the lac promoter or tac promoter was used. The timing of induction and the 
incubation time after induction were optimized. 

In these experiments the fused form of GAO (i.e. as a fusion protein with lacZ) was not 
found to provide advantages, and was not necessary to express the fungal gene. 

Galactose oxidase generally had reduced activity or lost its activity when codons were 
altered or when it was produced as a fused enzyme with a His-tag. Culture conditions were also 
important for production of the enzyme. 

Galactose oxidase was engineered by directed evolution to produce variants that are more 
active toward natural and additional substrates. Activity of the present mutants was as high as 
about 65 times that of wild-type GAO. Mutants of the invention also are more stable than 
wild-type, and in particular exhibit improved thermal stability. 